QUANTUM BILINEAR OPTIMIZATION


MARIO BERTA, OMAR FAWZI, AND VOLKHER B. SCHOLZ 


Abstract. We study optimization programs given by a bilinear form over non-commutative variables 
subject to linear inequalities. Problems of this form include the entangled value of two-prover games, 
entanglement-assisted coding for classical channels and quantum-proof randomness extractors. We 
introduce an asymptotically converging hierarchy of efficiently computable semidefinite programming 
(SDP) relaxations for this quantum optimization. This allows us to give upper bounds on the quantum 
advantage for all of these problems. Compared to previous work of Pironio, Navascués and Acín, our
hierarchy has additional constraints. By means of examples, we illustrate the importance of these new 
constraints both in practice and for analytical properties. Moreover, this allows us to give a hierarchy 
of SDP outer approximations for the completely positive semidefinite cone introduced by Laurent and 
Piovesan. 


1. Introduction 


1.1. Setting. A major goal in quantum information theory is to understand the advantage over 
classical protocols that can be achieved by allowing quantum protocols. For a given information 
processing task, identifying the optimal success rate for this task can be seen as an optimization over 
the set of valid protocols. The quantum advantage is then defined as the increase in the optimal 
value by allowing a larger set of protocols that make use of quantum theory. A family of tasks for 
which such an advantage is very well-studied is the family of games between multiple parties that 
are not allowed to communicate. As was first demonstrated by Bell [3, 17], there exist games for 
which entanglement between the players can increase the success probability beyond the ultimate 
limit of classical protocols. The fundamental limit for classical protocols is called a Bell inequality 
and its violation indicates an important feature of quantum theory called non-locality. The topic of 
non-locality has been a very active topic in quantum information theory and in the foundations of 
quantum mechanics; see [11] for a review. A quantum advantage can also be studied in many other 
settings including communication complexity [12], communication over a classical channel [21, 49] or 
randomness extractors [50]. One objective of this paper is to formulate many of these problems in a 
unified language as bilinear optimization programs. 

To make the discussion more concrete, we consider a specific example. Let $W_{X\to Y}$ be a noisy
channel mapping system $X$ to system $Y$. Assuming $X$ and $Y$ are discrete systems, we can describe
the channel by the transition probabilities $W_{X\to Y}(y|x)$ from $x$ to $y$ for all $(x,y) \in X \times Y$. The
goal is to send $k$ bits of information using this channel while minimizing the error probability for the
decoding. A valid protocol in this setting is given by an encoding function $e : [2^k] \to X$ and a decoding
function $d : Y \to [2^k]$. To take into account the possibility of randomized functions, we describe the
encoder by a probability distribution $\{e(x|i)\}_x$ on $X$ for every possible input $i \in [2^k]$, and similarly
the decoder by a distribution $\{d(i|y)\}_i$ on $[2^k]$ for every $y \in Y$. Given an encoder and a decoder, the
average success probability of our protocol can be expressed as $\frac{1}{2^k}\sum_{x,y,i} d(i|y)\, W_{X\to Y}(y|x)\, e(x|i)$. In

summary, optimizing the success probability of information transmission is captured by the following 
bilinear program 


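$$
\begin{aligned}
\underset{(e,d)}{\text{maximize}}\quad & \frac{1}{2^k}\sum_{x,y,i} W_{X\to Y}(y|x)\, d(i|y)\, e(x|i) \\
\text{subject to}\quad & \sum_{x} e(x|i) = 1 \quad \forall i \in [2^k] \\
& \sum_{i} d(i|y) = 1 \quad \forall y \in Y \\
& 0 \le e(x|i) \le 1 \quad \forall (x,i) \in X \times [2^k] \\
& 0 \le d(i|y) \le 1 \quad \forall (i,y) \in [2^k] \times Y .
\end{aligned}
\tag{1}
$$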
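As an illustration of the classical program (1), the following minimal Python sketch evaluates the objective and brute-forces all deterministic encoder/decoder pairs for a toy channel. The function names and the example channel (a binary symmetric channel with flip probability 1/4) are assumptions made for illustration and are not taken from the paper. Restricting to deterministic strategies relies on the fact that a bilinear function over a product of probability simplices attains its maximum at a vertex.

```python
from fractions import Fraction
from itertools import product


def success_probability(W, e, d, k):
    """Objective of (1): (1/2^k) * sum_{x,y,i} W(y|x) d(i|y) e(x|i).

    Index conventions (assumed here): W[x][y] = W(y|x), e[i][x] = e(x|i),
    d[y][i] = d(i|y).
    """
    n_msgs = 2 ** k
    total = Fraction(0)
    for i in range(n_msgs):
        for x in range(len(W)):
            for y in range(len(W[0])):
                total += W[x][y] * d[y][i] * e[i][x]
    return total / n_msgs


def one_hot(index, size):
    """Deterministic distribution putting all weight on `index`."""
    return [Fraction(int(j == index)) for j in range(size)]


def best_deterministic(W, k):
    """Brute-force the optimum of (1) over deterministic encoders and decoders."""
    n_msgs, n_in, n_out = 2 ** k, len(W), len(W[0])
    best = Fraction(0)
    for enc in product(range(n_in), repeat=n_msgs):
        e = [one_hot(x, n_in) for x in enc]
        for dec in product(range(n_msgs), repeat=n_out):
            d = [one_hot(i, n_msgs) for i in dec]
            best = max(best, success_probability(W, e, d, k))
    return best


# Toy example: binary symmetric channel with flip probability 1/4.
W_bsc = [[Fraction(3, 4), Fraction(1, 4)],
         [Fraction(1, 4), Fraction(3, 4)]]
```

For this toy channel and $k=1$, `best_deterministic(W_bsc, 1)` returns `Fraction(3, 4)`, attained by the identity encoder and decoder. The brute force is exponential in the alphabet sizes and is only meant for very small instances.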
Observe that allowing the encoder and the decoder to access (unlimited) shared randomness does not
change the optimal value of this program. A fundamental question is to study the effect of shared
entanglement for communication. It is not possible to communicate only using shared entanglement 
between the sender and the receiver. However, shared entanglement can offer important advantages 
for communication if we already have a quantum channel [5, 7, 6] or even a classical channel [21, 49]. 
The latter is the setting we consider here. A quantum protocol is described by a Hilbert space $\mathcal{H}$ (of
arbitrary dimension), a unit vector (called state) $|\psi\rangle \in \mathcal{H}\otimes\mathcal{H}$ shared between the encoder and the
decoder, and positive operator-valued measures on $\mathcal{H}$ for the encoder $\{E(x|i)\}_x$ for each $i \in [2^k]$ and
for the decoder $\{D(i|y)\}_i$ for each $y \in Y$. In the quantum setting, optimizing the success probability
for transmitting $k$ bits is given by


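$$
\begin{aligned}
\text{maximize}\quad & \frac{1}{2^k}\sum_{x,y,i} W_{X\to Y}(y|x)\, \langle\psi|E(x|i)\otimes D(i|y)|\psi\rangle \\
\text{subject to}\quad & \sum_{x} E(x|i) = \mathrm{id}_{\mathcal{H}} \quad \forall i \in [2^k] \\
& \sum_{i} D(i|y) = \mathrm{id}_{\mathcal{H}} \quad \forall y \in Y \\
& 0 \preceq E(x|i) \preceq \mathrm{id}_{\mathcal{H}} \quad \forall (x,i) \in X \times [2^k] \\
& 0 \preceq D(i|y) \preceq \mathrm{id}_{\mathcal{H}} \quad \forall (i,y) \in [2^k] \times Y .
\end{aligned}
\tag{2}
$$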


Here, $\langle\psi|$ is the conjugate transpose of the vector $|\psi\rangle$ and we write $D \preceq E$ if the operator $E - D$ is
positive semidefinite. As we can always take $\mathcal{H} = \mathbb{C}$, any feasible solution for (1) is also a feasible
solution for (2).

Allowing for quantum protocols also leads to the definition of quantum graph parameters [16, 51, 40, 
13]. For example, the stability number of a graph G can be viewed in terms of the success probability of 
a two-prover game depending on G, or in terms of the success probability for information transmission 
over a noisy channel defined by G. Allowing quantum protocols in these tasks naturally leads to 
the definition of quantum stability numbers of a graph. To study such quantum graph parameters, 
Laurent and Piovesan [40] recently introduced a non-commutative analog of the completely positive
cone $\mathcal{CP}$ called the completely positive semidefinite cone $\mathcal{CS}_+$. For the aforementioned problems, the
set of quantum strategies can then be described using $\mathcal{CS}_+$, and the quantum advantage is witnessed
by the fact that $\mathcal{CS}_+$ is larger than $\mathcal{CP}$.

Having phrased the setup, let us now give a short overview of our findings.


1.2. Results. We start by phrasing problems like the ones stated above as optimization programs. 
More precisely, we study the class of tasks that can be described by optimizing a bilinear function 
subject to linear inequalities. The optimization over classical protocols corresponds to a program 
similar to (1) with commutative (scalar) variables, whereas the optimization over quantum protocols 
corresponds to allowing the variables to be operator-valued as in (2). As the expression suggests,
optimization over quantum protocols seems quite complicated. In fact, as there is no bound on the
dimension of the Hilbert space, it is not known whether the optimal value is even computable. In
the context of games, Navascués, Pironio and Acín (NPA) [41] introduced a family of semidefinite
programming (SDP) relaxations that give efficiently computable upper bounds on quantum bilinear 
programs. This hierarchy was shown to asymptotically converge to the optimal quantum protocol [42, 
47, 22]. These hierarchies can be seen as non-commutative versions of the sum-of-squares hierarchies 
introduced by Lasserre and Parrilo [39, 46]. 

Our first contribution is the observation that many information processing tasks can be formulated 
in this way. We believe that phrasing these seemingly different problems in a unified language will
help in our understanding of each one of these problems. Moreover, we think that tools developed in 
the context of optimization should be valuable in characterizing the power and limitations of quantum 
protocols. Our second contribution is to give a new hierarchy of SDPs that gives upper bounds on 
quantum bilinear programs. Compared to the previous contributions [41, 42, 47, 22], our hierarchy 
has some additional constraints which we illustrate to be useful in several settings. For example, the 
first level of our hierarchy has the nice property of being naturally bounded by the maximal value 
of the general problem, i.e., it is bounded by one for the case of channel coding discussed above. In 
addition, by means of a specific example, we show that our SDPs can give better bounds in practice. 
The new constraints are also important to study the completely positive semidefinite cone $\mathcal{CS}_+$, which
consists of all the symmetric matrices that admit a Gram representation by positive semidefinite
matrices of any size. In fact, we show that these constraints lead to a natural hierarchy of SDP outer
approximations for the completely positive semidefinite cone $\mathcal{CS}_+$.

1.3. Organization of the Paper. In Section 2, we introduce the general setup of quantum bilinear 
optimization and present our new hierarchy of SDPs. We keep the main text elementary and only 
prove that our SDPs give upper bounds on the quantum programs when the Hilbert space is
finite-dimensional (the infinite-dimensional case as well as the convergence of the hierarchy are deferred to
appendices). In Section 3, we describe applications to two-prover games, channel coding, randomness 
extractors as well as to the optimization over the completely positive semidefinite cone. 

2. Bilinear Optimization 

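2.1. Setup. As motivated in (1) we would like to start from the following type of (classical) bilinear
optimization program with real variables $z_\alpha$ for $\alpha \in [N] := \{1,\dots,N\}$ and $y_\beta$ for $\beta \in [M] := \{1,\dots,M\}$,

$$
\begin{aligned}
p[A,\mathcal{G},\mathcal{K}] := \underset{(z_\alpha,\,y_\beta)}{\text{maximize}}\quad & \sum_{\alpha,\beta} A_{\alpha,\beta}\, z_\alpha\, y_\beta \\
\text{subject to}\quad & g(z_1,\dots,z_N) \ge 0 \quad \forall g \in \mathcal{G} \\
& k(y_1,\dots,y_M) \ge 0 \quad \forall k \in \mathcal{K},
\end{aligned}
\tag{3}
$$

with sets of affine constraints $\mathcal{G} := \{g(z_1,\dots,z_N)\}$ and $\mathcal{K} := \{k(y_1,\dots,y_M)\}$, where

$$
g(z_1,\dots,z_N) := g^0 + \sum_{\alpha\in[N]} g^\alpha z_\alpha \quad\text{and}\quad k(y_1,\dots,y_M) := k^0 + \sum_{\beta\in[M]} k^\beta y_\beta .
\tag{4}
$$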

For convenience we also define the complete set of constraints

$$ \mathcal{F} := \mathcal{G} \cup \mathcal{K} \cup \{1\} \tag{5} $$

where 1 is the function always equal to 1. Moreover, call

$$ p[A,\mathcal{F}] := p[A,\mathcal{G},\mathcal{K}] \tag{6} $$

the classical value of (3). We restrict ourselves to affine constraints as all our applications have this 
form. It is however possible to extend the approach to polynomial equality constraints and have a 
linear term in the objective function, see Appendix C. 
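To make the data of the classical program (3) concrete, here is a small Python sketch that stores each affine constraint as its coefficient list $(c^0, c^1, \dots)$ as in (4), checks feasibility of a candidate point, and evaluates the bilinear objective. The helper names, the tolerance and the example data are illustrative assumptions, not part of the paper.

```python
def affine(coeffs, values):
    """Evaluate c^0 + sum_a c^a * v_a for coeffs = [c^0, c^1, ..., c^n]."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], values))


def is_feasible(G, K, z, y, tol=1e-12):
    """Check g(z) >= 0 for all g in G and k(y) >= 0 for all k in K."""
    return (all(affine(g, z) >= -tol for g in G)
            and all(affine(k, y) >= -tol for k in K))


def objective(A, z, y):
    """Bilinear objective sum_{a,b} A[a][b] * z[a] * y[b] of (3)."""
    return sum(A[a][b] * z[a] * y[b]
               for a in range(len(z)) for b in range(len(y)))


# Hypothetical instance with N = M = 1 and box constraints 0 <= z, y <= 1:
# z >= 0 is encoded as [0, 1] and 1 - z >= 0 as [1, -1].
G_box = [[0, 1], [1, -1]]
K_box = [[0, 1], [1, -1]]
A_toy = [[2]]
```

With this encoding, `is_feasible(G_box, K_box, [1], [1])` holds and `objective(A_toy, [1], [1])` equals 2, whereas the point `z = [1.5]` violates the constraint $1 - z \ge 0$.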

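In analogy to (2) the corresponding quantum bilinear optimization program of (3) is then as follows.
Let $\mathcal{H}$ be a Hilbert space (of arbitrary dimension), $|\psi\rangle \in \mathcal{H}$ with $\langle\psi|\psi\rangle = 1$, and let $E_\alpha$, $D_\beta$ be Hermitian
operators in the algebra $B(\mathcal{H})$ of bounded linear operators on $\mathcal{H}$. By substituting the variables $z_\alpha$ in
the linear constraints with operators $E_\alpha$ (and similarly for $y_\beta$ with $D_\beta$) we set

$$
\begin{aligned}
p^*[A,\mathcal{G},\mathcal{K}] := \underset{(\mathcal{H},\,\psi,\,E_\alpha,\,D_\beta)}{\text{maximize}}\quad & \sum_{\alpha,\beta} A_{\alpha,\beta}\, \langle\psi|E_\alpha D_\beta|\psi\rangle \\
\text{subject to}\quad & [E_\alpha, D_\beta] = 0 \quad \forall (\alpha,\beta) \in [N]\times[M] \\
& g(E_1,\dots,E_N) \succeq 0 \quad \forall g \in \mathcal{G} \\
& k(D_1,\dots,D_M) \succeq 0 \quad \forall k \in \mathcal{K},
\end{aligned}
\tag{7}
$$

where $[E_\alpha,D_\beta] := E_\alpha D_\beta - D_\beta E_\alpha$ denotes the commutator, and $g(E_1,\dots,E_N) \succeq 0$ means that the
operator $g(E_1,\dots,E_N)$ is positive semidefinite (and similarly for $k(D_1,\dots,D_M) \succeq 0$). We note
that we do not think of the commutation conditions $[E_\alpha,D_\beta] = 0$ $\forall(\alpha,\beta) \in [N]\times[M]$ as being
constraints, but rather as being part of the “quantization procedure” itself. This is motivated by our
examples originating from information theory, where the commutation relations naturally lead to their
quantum versions. Moreover, from now on we assume that the sets of constraints $\mathcal{G}$, $\mathcal{K}$ satisfy the
following.

Assumption 2.1. The sets of constraints $\mathcal{G}$, $\mathcal{K}$ imply that there exists a positive constant $C > 0$ such
that the relations $-C\,\mathbb{1} \preceq E_\alpha \preceq C\,\mathbb{1}$ and $-C\,\mathbb{1} \preceq D_\beta \preceq C\,\mathbb{1}$ hold for all $(\alpha,\beta) \in [N]\times[M]$. Moreover,
all operators denoted by $E_\alpha$ and $D_\beta$ are assumed to be self-adjoint.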


¹ Here and henceforth we write maximize for taking the supremum (in particular the maximum might not be attained).
We note that the assumption above implies that the operator-valued variables are always bounded
operators, as the relations above together with the assumption of self-adjointness imply $\|E_\alpha\|, \|D_\beta\| \le C$.

In the following we call

$$ p^*[A,\mathcal{F}] := p^*[A,\mathcal{G},\mathcal{K}] \tag{8} $$

the quantum value of (7), with the total set of constraints $\mathcal{F}$ as in (5). Clearly the quantum value is
never smaller than the classical value,

$$ p[A,\mathcal{F}] \le p^*[A,\mathcal{F}] . \tag{9} $$

Note that compared to the entanglement-assisted channel coding example (2) we do not assume that
the Hilbert space $\mathcal{H}$ has tensor product form with $E_\alpha$ acting on the first factor and $D_\beta$ acting on
the second factor, but only that $E_\alpha$ and $D_\beta$ commute. This takes into account the most general
formulation of quantum mechanics [29] (see also [10] for a quantum information theory reference).
However, for every feasible solution of (7) corresponding to a finite-dimensional Hilbert space, we
can assume that the Hilbert space has a tensor product structure $\mathcal{H}\otimes\mathcal{H}$ with operators $E_\alpha\otimes\mathbb{1}$ and
$\mathbb{1}\otimes D_\beta$ (instead of just $[E_\alpha,D_\beta] = 0$ on a single space $\mathcal{H}$); see e.g., [54, Chapter 5] or, for a
self-contained quantum information theory reference, [52]. Moreover, for the general infinite-dimensional
case the optimal value of (2) is certainly upper bounded by the optimal value of the corresponding
program (7).

Remark 2.2. Provided Connes’ embedding conjecture has a positive answer [20], we can restrict the
optimization in (7) to finite-dimensional Hilbert spaces (and thus of tensor product form). This was
proved for the special case of bipartite games in [32, 25, 44]. For a proof sketch for the general case
see Appendix B.

Our ultimate goal is to understand the gap between the classical value p[A,F] and the quantum 
value p*[A,F] for operational examples of interest. For the problems that we study in this paper 
p[A,F] is typically understood but estimating p*[A,F] is the challenge. Lower bounds on p*[A,F] 
can then be found by any feasible solution of (7) but upper bounds are harder to find (basically 
because the optimization in (7) is over Hilbert spaces of unbounded dimension). Building on the 
works of Navascués, Pironio and Acín [41, 42] and Doherty, Liang, Toner and Wehner [22] in the
context of games, Pironio, Navascués and Acín [47] gave asymptotically converging hierarchies of SDP
relaxations for general quantum polynomial optimization (see [26] for an operator algebra point of
view on this hierarchy). We briefly sketch their results when applied to our more specific setting of
quantum bilinear optimization as in (7). 

2.2. Generating upper bounds. This section mainly serves motivational purposes. As our goal is 
to derive semidefinite program relaxations of (7), we first outline a simplified analysis which will lead 
to upper bounds. These are then identified to be equal to the levels in the hierarchy of Navascués,
Pironio and Acín. The precise connection is briefly explained in the next section. We do not provide
proofs, and refer the reader to the original papers [41, 42] for more details.

We first introduce some notation. Let $\Sigma_\infty$ denote the free complex *-algebra generated by the
$N + M$ symbols

$$ z_1,\dots,z_N,\ y_1,\dots,y_M . \tag{10} $$

In other words, these are the non-commutative polynomials in the variables $z$, $y$. The monomials of
$\Sigma_\infty$ are also called words and can be indexed by a tuple $u = (u_1,\dots,u_\ell)$ with $u_i \in \{1,\dots,N+M\}$. For
example, the monomial $x_u$ indexed by $u = (1,3,3,N+2)$ is defined as $x_u = z_1 z_3^2 y_2$. The degree of a
monomial $x_u$, which is also called the length of the word, is denoted $\ell(u)$. The unit monomial $x_\emptyset$ is
called the empty word indexed by $\emptyset$, and has length zero. Words $x_u, x_v$ are concatenated as

$$ x_u \circ x_v := x_{u\circ v} \quad\text{with}\quad u\circ v := (u_1,\dots,u_{\ell(u)},v_1,\dots,v_{\ell(v)}) . \tag{11} $$

The algebra $\Sigma_\infty$ also carries a natural involution $* : \Sigma_\infty \to \Sigma_\infty$ reversing the order of words with

$$ x_u^* := x_{u^*} \quad\text{with}\quad u^* := (u_{\ell(u)},\dots,u_1), \tag{12} $$




and being the complex conjugation for complex scalars. For a fixed integer $n \in \mathbb{N}$, the set of words
(monomials) of length up to $n$, $\ell(w) \le n$, spans a vector space of dimension

$$ d(n) := \frac{(N+M)^{n+1} - 1}{N+M-1} . \tag{13} $$
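The word combinatorics above can be made explicit with a short Python sketch. Words are represented as tuples of symbol indices in $\{1,\dots,N+M\}$, the empty tuple plays the role of the empty word, and the count of words of length at most $n$ is compared against the closed form (13). The function names are illustrative choices and not notation from the paper.

```python
from itertools import product


def words_up_to(num_symbols, n):
    """All words (tuples over 1..num_symbols) of length at most n."""
    return [w for length in range(n + 1)
            for w in product(range(1, num_symbols + 1), repeat=length)]


def concat(u, v):
    """Concatenation u o v of two words, as in (11)."""
    return u + v


def star(u):
    """Involution u* that reverses the order of the symbols, as in (12)."""
    return tuple(reversed(u))


def d(num_symbols, n):
    """Closed form (13) for the number of words of length at most n.

    Assumes num_symbols >= 2, which holds since N, M >= 1.
    """
    return (num_symbols ** (n + 1) - 1) // (num_symbols - 1)
```

For instance, with $N + M = 3$ symbols and $n = 2$ there are $1 + 3 + 9 = 13$ words, in agreement with `d(3, 2)`. The involution reverses concatenations, i.e. `star(concat(u, v)) == concat(star(v), star(u))`.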


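Now for every feasible solution $(\mathcal{H},\psi,E_\alpha,D_\beta)$ of (7), we define the linear form

$$ \tilde\omega : \Sigma_\infty \to \mathbb{C} \quad\text{with}\quad \tilde\omega(u) := \langle\psi|X_u|\psi\rangle, \tag{14} $$

where $X_u$ stands for the explicit representation of the word $x_u$ in terms of the operators $E_\alpha$ and $D_\beta$
for the symbols $z_\alpha$ and $y_\beta$ respectively. Next, we choose $n \in \mathbb{N}$ and consider the $d(n)\times d(n)$ matrix
labeled by words $u, v \in \Sigma_n$, where $\Sigma_n$ denotes the set of words of length at most $n$,

$$ \tilde\Omega := \sum_{u,v\in\Sigma_n} \tilde\Omega_{u,v}\,|u\rangle\langle v| \quad\text{with entries}\quad \tilde\Omega_{u,v} := \langle\psi|X_u^* X_v|\psi\rangle . \tag{15} $$

Here $|u\rangle\langle v|$ refers to the matrix with all zero entries except for the entry labeled $(u,v)$ which is equal
to 1. This matrix is positive semidefinite since it is the Gram matrix of the vectors $X_u|\psi\rangle$. Moreover,
the linear constraints $f \in \mathcal{F}$ generate $d(n-1)\times d(n-1)$ matrices

$$ \tilde\Omega[f] := \sum_{i=0}^{N+M} f^i \sum_{u,v\in\Sigma_{n-1}} \tilde\Omega_{u,(i)\circ v}\,|u\rangle\langle v| \tag{16} $$

that are positive semidefinite as well (where $(i)$ indexes words of length one: the $i$-th symbol, and $(0)$ is the empty word). For the
commutativity constraints between $E_\alpha$ and $D_\beta$, this can be simply captured by identifying words $u \sim v$
if $u$ can be obtained from $v$ by using commutation between $z_\alpha$ and $y_\beta$. For example, $z_1 y_1 z_2 \sim z_1 z_2 y_1$.
Restricting in (15) and (16) to constraints that only involve words up to length $n$ defines a hierarchy
of semidefinite program relaxations. In more detail, for any $n \ge 1$

$$
\begin{aligned}
\widetilde{\mathrm{sdp}}_n[A,\mathcal{F}] := \text{maximize}\quad & \sum_{\alpha,\beta} A_{\alpha,\beta}\, \tilde\Omega^n_{\alpha,\beta} \\
\text{subject to}\quad & \tilde\Omega^n \in \mathrm{Pos}(d(n)) \\
& \tilde\Omega^n_{\emptyset,\emptyset} = 1 \\
& \tilde\Omega^n_{u,v^*\circ w} = \tilde\Omega^n_{v\circ u,w} \quad \forall u,v,w : v\circ u,\ v^*\circ w \in \Sigma_n \\
& \tilde\Omega^n_{u,v} = \tilde\Omega^n_{u',v'} \quad \forall u,u',v,v' \in \Sigma_n : u \sim u',\ v \sim v' \\
& \tilde\Omega^n[f] := \sum_{i=0}^{N+M} f^i \sum_{u,v\in\Sigma_{n-1}} \tilde\Omega^n_{u,(i)\circ v}\,|u\rangle\langle v| \in \mathrm{Pos}(d(n-1)) \quad \forall f \in \mathcal{F},
\end{aligned}
\tag{17}
$$

where the indices $\alpha$ and $\beta$ in the objective label the length-one words $z_\alpha$ and $y_\beta$, $\mathrm{Pos}(d(n))$ denotes the set of positive semidefinite matrices of size $d(n)$ as in (13), and we have the
total set of constraints $\mathcal{F}$ as in (5). It now turns out by comparison to [41, 42] that the programs (17)
match exactly the semidefinite relaxations derived by Navascués, Pironio and Acín.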

2.3. NPA hierarchy. In the optimization literature the matrices $\tilde\Omega^n$ appearing in the program (17)
are called moment matrices while the matrices $\tilde\Omega^n[f]$ are called localizing matrices. However, the
program (17) is not derived as presented above, but by introducing dual variables of the optimization
problem (7), which then can be identified with the matrices $\tilde\Omega^n$ and $\tilde\Omega^n[f]$. In case the moment matrix
of the optimal solution is of the form (15), then the optimal value equals $p^*[A,\mathcal{F}]$.

Clearly the levels of the NPA hierarchy are monotonically decreasing in the sense that for any $n \in \mathbb{N}$,

$$ \widetilde{\mathrm{sdp}}_n[A,\mathcal{F}] \ge \widetilde{\mathrm{sdp}}_{n+1}[A,\mathcal{F}], \tag{18} $$

and by the preceding discussion we also have

$$ p^*[A,\mathcal{F}] \le \widetilde{\mathrm{sdp}}_n[A,\mathcal{F}] . \tag{19} $$

The first major contribution of [22, 42, 47] was a proof that the above sequence also converges to the
value of $p^*[A,\mathcal{F}]$,

$$ p^*[A,\mathcal{F}] = \lim_{n\to\infty} \widetilde{\mathrm{sdp}}_n[A,\mathcal{F}], \tag{20} $$

under the Assumption 2.1. This is achieved by showing that the quadratic module can be assumed to be
Archimedean and by an explicit construction of the Hilbert space and associated operators.²

The first few levels of the NPA hierarchy have been used intensively in order to understand the 
separation between the classical and the quantum value of two-prover games, see e.g., [45]. In the 
following we propose an alternative SDP hierarchy. This hierarchy is not only useful for studying 
two-prover games but also for other problems like (entanglement-assisted) one-shot channel coding, 
(quantum-proof) randomness extractors, and for optimizations over the completely positive
semidefinite cone.


2.4. New Hierarchy. We use a way different from (16) for generating constraints. Instead of defining
the NPA linear form $\tilde\omega$ as in (14) we define a bilinear form $\omega : \Sigma_\infty\times\Sigma_\infty \to \mathbb{C}$ that we now describe
for the case of finite-dimensional Hilbert spaces. The general case can be found in Appendix A. Now
as stated above, in finite dimensions we can assume that the non-commutative optimization in (7)
is over tensor product Hilbert spaces $\mathcal{H}\otimes\mathcal{H}$ with operators $E_\alpha\otimes\mathbb{1}$ and $\mathbb{1}\otimes D_\beta$ (instead of just
$[E_\alpha,D_\beta] = 0$ on a single space $\mathcal{H}$). We start with any feasible solution $(\mathcal{H}\otimes\mathcal{H},\psi,E_\alpha\otimes\mathbb{1},\mathbb{1}\otimes D_\beta)$
where again the operators $E_\alpha$ are explicit representations of the symbols $z_\alpha$ and the operators $D_\beta$
are explicit representations of the symbols $y_\beta$. Taking the partial trace over the first space $\mathcal{H}$, we
denote

$$ \sigma := \mathrm{Tr}_{\mathcal{H}}\big[|\psi\rangle\langle\psi|\big] := \sum_i (\langle i|\otimes\mathbb{1})\,|\psi\rangle\langle\psi|\,(|i\rangle\otimes\mathbb{1}) \quad\text{and write}\quad |\psi\rangle = (U^\dagger\otimes\sigma^{1/2})\,|\Phi\rangle, \tag{21} $$

where $|\Phi\rangle := \sum_i |i\rangle|i\rangle$ for some orthonormal basis $\{|i\rangle\}$ of $\mathcal{H}$ and a unitary $U$. The objective function
of the quantum bilinear optimization program (7) can then be rewritten as


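$$
\begin{aligned}
\sum_{\alpha,\beta} A_{\alpha,\beta}\,\langle\psi|E_\alpha\otimes D_\beta|\psi\rangle &= \sum_{\alpha,\beta} A_{\alpha,\beta}\,\langle\Phi|\,U E_\alpha U^\dagger\otimes\sigma^{1/2} D_\beta\,\sigma^{1/2}\,|\Phi\rangle && (22) \\
&= \sum_{\alpha,\beta} A_{\alpha,\beta}\,\mathrm{Tr}\big[\bar U E_\alpha^T U^T \sigma^{1/2} D_\beta\,\sigma^{1/2}\big], && (23)
\end{aligned}
$$

where $E^T$ denotes the transpose of the operator $E$ and $\bar U$ is the complex conjugate of $U$ in the basis
$\{|i\rangle\}$ of $\mathcal{H}$. We note that the transpose as well as the conjugation by unitary operators preserve our
constraints, and hence may be just absorbed in the operators $E_\alpha$, as we maximize over them. Hence,
we get the following alternative form of (7),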


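$$
\begin{aligned}
p^*[A,\mathcal{G},\mathcal{K}] = \underset{(\mathcal{H},\,\sigma,\,E_\alpha,\,D_\beta)}{\text{maximize}}\quad & \sum_{\alpha,\beta} A_{\alpha,\beta}\,\mathrm{Tr}\big[E_\alpha\,\sigma^{1/2} D_\beta\,\sigma^{1/2}\big] \\
\text{subject to}\quad & \sigma \succeq 0,\ \mathrm{Tr}[\sigma] = 1 \\
& g(E_1,\dots,E_N) \succeq 0 \quad \forall g \in \mathcal{G} \\
& k(D_1,\dots,D_M) \succeq 0 \quad \forall k \in \mathcal{K},
\end{aligned}
\tag{24}
$$

under the assumption that $\mathcal{H}$ is finite-dimensional (see Appendix A for the general case). Now, for
fixed $\sigma$ we define the bilinear form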




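$$ \omega : \Sigma_\infty\times\Sigma_\infty \to \mathbb{C} \quad\text{with}\quad \omega(u,v) := \mathrm{Tr}\big[X_u\,\sigma^{1/2} X_v\,\sigma^{1/2}\big] . \tag{25} $$

Similarly as for NPA we look at the (infinite-dimensional) matrix

$$ \Omega := \sum_{u,v} \Omega_{u,v}\,|u\rangle\langle v| \quad\text{with entries}\quad \Omega_{u,v} := \omega(u^*,v) = \mathrm{Tr}\big[X_u^*\,\sigma^{1/2} X_v\,\sigma^{1/2}\big] \tag{26} $$

and find that it is positive semidefinite. However, the bilinear form (25) gives us even more structure.
Namely we can say that the reordered (infinite-dimensional) matrix

$$ \Omega[1,1] := \sum_{s,t,u,v} \Omega_{s^*\circ t,\,u^*\circ v}\,|s\rangle\langle t|\otimes|u\rangle\langle v| \tag{27} $$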

s,t,u,v 


² For a given set of constraints $\mathcal{F}$, the quadratic module is the set of polynomials with variables in $\Sigma_\infty$ which
are of the form $\sum_i a_i^* a_i + \sum_{i,j} b_{ij}^* f_i\, b_{ij}$ for polynomials $a_i$, $b_{ij}$ in $\Sigma_\infty$ and $f_i \in \mathcal{F}$. It is called Archimedean if there exists a constant $C > 0$
such that the polynomial $C - \sum_i x_i^* x_i$ is an element. Note that we again assumed that the free variables are
hermitian.
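is positive semidefinite as well. To see this, take a vector $|\phi\rangle = \sum_{s,u} c_{s,u}\,|s\rangle|u\rangle$. Then, we have

$$
\begin{aligned}
\langle\phi|\,\Omega[1,1]\,|\phi\rangle &= \sum_{s,t,u,v} \bar c_{s,u}\,c_{t,v}\,\Omega_{s^*\circ t,\,u^*\circ v} = \sum_{s,t,u,v} \bar c_{s,u}\,c_{t,v}\,\mathrm{Tr}\big[X_t^* X_s\,\sigma^{1/2} X_u^* X_v\,\sigma^{1/2}\big] && (28) \\
&= \mathrm{Tr}\Big[\Big(\sum_{s,u} \bar c_{s,u}\,X_s\,\sigma^{1/2} X_u^*\Big)\Big(\sum_{t,v} c_{t,v}\,X_v\,\sigma^{1/2} X_t^*\Big)\Big] && (29) \\
&= \mathrm{Tr}\Big[\Big(\sum_{s,u} c_{s,u}\,X_u\,\sigma^{1/2} X_s^*\Big)^{\dagger}\Big(\sum_{t,v} c_{t,v}\,X_v\,\sigma^{1/2} X_t^*\Big)\Big] \ge 0 . && (30)
\end{aligned}
$$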
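The same type of positivity argument can be sanity-checked numerically in the simplest case, namely for a moment matrix with entries $\mathrm{Tr}[X_u^*\sigma^{1/2}X_v\sigma^{1/2}]$ restricted to words of length at most one. The Python sketch below uses hypothetical real symmetric $2\times 2$ operators (so that the adjoint is the transpose) and a diagonal $\sigma$; all numerical values and helper names are assumptions chosen for illustration only.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


# Hypothetical real symmetric operators on a 2-dimensional space.
identity = [[1.0, 0.0], [0.0, 1.0]]
E1 = [[0.5, 0.2], [0.2, 0.3]]
D1 = [[0.7, -0.1], [-0.1, 0.4]]
# sigma = diag(0.6, 0.4) has unit trace; its square root is diagonal too.
sqrt_sigma = [[0.6 ** 0.5, 0.0], [0.0, 0.4 ** 0.5]]

# Representations of the empty word, z_1 and y_1 (in this order).
X = [identity, E1, D1]


def omega_entry(u, v):
    """Entry Tr[X_u^T sigma^{1/2} X_v sigma^{1/2}] of the moment matrix."""
    left = matmul(transpose(X[u]), sqrt_sigma)
    right = matmul(X[v], sqrt_sigma)
    return trace(matmul(left, right))


Omega = [[omega_entry(u, v) for v in range(3)] for u in range(3)]


def quadratic_form(c):
    """c^T Omega c, which the Gram-type argument predicts to be >= 0."""
    return sum(c[u] * Omega[u][v] * c[v] for u in range(3) for v in range(3))


random.seed(0)
samples = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
min_value = min(quadratic_form(c) for c in samples)
```

In this toy instance the entry `Omega[0][0]` equals $\mathrm{Tr}[\sigma] = 1$ up to rounding, the entry `Omega[1][2]` is the term $\mathrm{Tr}[E_1\sigma^{1/2}D_1\sigma^{1/2}]$ appearing in the objective of (24), and `min_value` is non-negative up to floating-point error. Random sampling is of course only a plausibility check and not a proof of positive semidefiniteness.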


More generally, any pair of linear constraints $f, f' \in \mathcal{F}$ from (5) generate (infinite-dimensional) matrices

$$ \Omega[f,f'] := \sum_{i,j=0}^{N+M} f^i f'^j \sum_{r,s,u,v} \Omega_{r^*\circ(i)\circ s,\,u^*\circ(j)\circ v}\,|r\rangle\langle s|\otimes|u\rangle\langle v| \tag{31} $$

that are positive semidefinite by the same argument as in (28)-(30). Now, restricting in (26) and (31)
to constraints that only involve words up to length $n$ defines the $n$-th level of our new hierarchy. The
variable we optimize over is now a matrix $\Omega^n$ whose rows and columns are indexed by words of length
at most $n$. That is, for $n$ odd we define

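$$
\begin{aligned}
\mathrm{sdp}_n[A,\mathcal{F}] := \text{maximize}\quad & \sum_{\alpha,\beta} A_{\alpha,\beta}\,\Omega^n_{\alpha,\beta} \\
\text{subject to}\quad & \Omega^n \in \mathrm{Pos}(d(n)) \\
& \Omega^n_{\emptyset,\emptyset} = 1 \\
& \Omega^n[f,f'] := \sum_{i,j=0}^{N+M} f^i f'^j \sum_{r,s,u,v\in\Sigma_{(n-1)/2}} \Omega^n_{r^*\circ(i)\circ s,\,u^*\circ(j)\circ v}\,|r\rangle\langle s|\otimes|u\rangle\langle v| \\
& \qquad\qquad \in \mathrm{Pos}\big(d((n-1)/2)^2\big) \quad \forall f,f' \in \mathcal{F} .
\end{aligned}
\tag{32}
$$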


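Note that the third constraints of the form $\Omega^n[1,f] \succeq 0$ correspond to constraints $\tilde\Omega^n[f] \succeq 0$ in the
NPA hierarchy as in (17). For $n \ge 2$ even, we replace the last constraint in (32) with the following
constraints where $n' := (n-2)/2$:

$$
\begin{aligned}
\Omega^n[1,1] &:= \sum_{r,s,u,v\in\Sigma_{n/2}} \Omega^n_{r^*\circ s,\,u^*\circ v}\,|r\rangle\langle s|\otimes|u\rangle\langle v| \in \mathrm{Pos}\big(d(n/2)\,d(n/2)\big) \\
\Omega^n[f,1] &:= \sum_{i=0}^{N+M} f^i \sum_{\substack{r,s\in\Sigma_{n'} \\ u,v\in\Sigma_{n/2}}} \Omega^n_{r^*\circ(i)\circ s,\,u^*\circ v}\,|r\rangle\langle s|\otimes|u\rangle\langle v| \in \mathrm{Pos}\big(d(n')\,d(n/2)\big) \quad \forall f \in \mathcal{F} \\
\Omega^n[1,f] &:= \sum_{i=0}^{N+M} f^i \sum_{\substack{r,s\in\Sigma_{n/2} \\ u,v\in\Sigma_{n'}}} \Omega^n_{r^*\circ s,\,u^*\circ(i)\circ v}\,|r\rangle\langle s|\otimes|u\rangle\langle v| \in \mathrm{Pos}\big(d(n/2)\,d(n')\big) \quad \forall f \in \mathcal{F} .
\end{aligned}
\tag{33}
$$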

In accordance with the literature we call the matrices $\Omega^n$ moment matrices and the matrices $\Omega^n[f,f']$
(including the cases $f = 1$ or $f' = 1$) localizing matrices. Clearly the levels of this new hierarchy are monotonically decreasing in the
sense that for any $n \in \mathbb{N}$,

$$ \mathrm{sdp}_n[A,\mathcal{F}] \ge \mathrm{sdp}_{n+1}[A,\mathcal{F}] . \tag{34} $$

We note that the SDPs we derive correspond in the special case where $|\psi\rangle$ is restricted to be a
maximally entangled state on $\mathcal{H}\otimes\mathcal{H}$, or equivalently $\sigma$ to be maximally mixed, to the SDP relaxations
proposed in [38]. Such relaxations were also used for verifying experimental findings [19].
The following theorem summarizes the relationship between $p^*[A,\mathcal{F}]$ and the sequence of SDPs
$\mathrm{sdp}_n[A,\mathcal{F}]$.

Theorem 2.3. Using the notation in this section, we have for all $n \ge 1$,

$$ p^*[A,\mathcal{F}] \le \mathrm{sdp}_n[A,\mathcal{F}] . \tag{35} $$

Moreover, under the Assumption 2.1 we have

$$ p^*[A,\mathcal{F}] = \lim_{n\to\infty} \mathrm{sdp}_n[A,\mathcal{F}] . \tag{36} $$

Proof. The inequality (35) was proved above for finite-dimensional Hilbert spaces. For the general case
see Appendix A.1. For (36), a self-contained proof can be found in Appendix A.2. The convergence
also follows from the convergence of the NPA hierarchy (20) together with Proposition 2.4. □

We now discuss the first level relaxation of our new hierarchy (32) in more detail. 


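2.5. First Level Relaxation. For applications the first level relaxation often already gives good
bounds. We find

$$
\begin{aligned}
\mathrm{sdp}_1[A,\mathcal{F}] = \text{maximize}\quad & \sum_{\alpha,\beta} A_{\alpha,\beta}\,\Omega^1_{\alpha,\beta} \\
\text{subject to}\quad & \Omega^1 \in \mathrm{Pos}(1+N+M) \\
& \Omega^1_{\emptyset,\emptyset} = 1 \\
& \sum_{i,j=0}^{N+M} f^i f'^j\,\Omega^1_{(i),(j)} \ge 0 \quad \forall f,f' \in \mathcal{F} .
\end{aligned}
\tag{37}
$$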


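Compared to this, the first level relaxation of the NPA hierarchy (17) gives

$$
\begin{aligned}
\widetilde{\mathrm{sdp}}_1[A,\mathcal{F}] = \text{maximize}\quad & \sum_{\alpha,\beta} A_{\alpha,\beta}\,\tilde\Omega^1_{\alpha,\beta} \\
\text{subject to}\quad & \tilde\Omega^1 \in \mathrm{Pos}(1+N+M) \\
& \tilde\Omega^1_{\emptyset,\emptyset} = 1 \\
& \sum_{i=0}^{N+M} f^i\,\tilde\Omega^1_{\emptyset,(i)} \ge 0 \quad \forall f \in \mathcal{F} .
\end{aligned}
\tag{38}
$$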


By inspection we find that (37) has extra constraints compared to (38). This implies in particular
that the first level of our hierarchy is never a worse approximation than the first level of the NPA
hierarchy,

$$ \mathrm{sdp}_1[A,\mathcal{F}] \le \widetilde{\mathrm{sdp}}_1[A,\mathcal{F}] . \tag{39} $$

The extra conditions are of the form

$$ \sum_{i,j=0}^{N+M} g^i g'^j\,\Omega^1_{(i),(j)} \ge 0 \quad \forall g,g' \in \mathcal{G}, \qquad \sum_{i,j=0}^{N+M} k^i k'^j\,\Omega^1_{(i),(j)} \ge 0 \quad \forall k,k' \in \mathcal{K}, \tag{40} $$

$$ \sum_{i,j=0}^{N+M} g^i k^j\,\Omega^1_{(i),(j)} \ge 0 \quad \forall g \in \mathcal{G},\ \forall k \in \mathcal{K} . \tag{41} $$

We note that in many settings the constraint (41) can be inferred from the second level of the NPA
hierarchy and hence can be added to the first NPA level as needed when evaluating examples. The
former conditions (40) however are qualitatively different from the NPA hierarchy. We will see later
that for certain applications and examples the additional conditions (40) are useful (Section 3). In
the following section we compare the higher levels of the two hierarchies.
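2.6. Relations between Hierarchies. Although a direct comparison of our new hierarchy with the
NPA hierarchy is difficult (see the argument below) we can give the following connection.

Proposition 2.4. As already seen in (37) and (38) we have

$$ \mathrm{sdp}_1[A,\mathcal{F}] \le \widetilde{\mathrm{sdp}}_1[A,\mathcal{F}] . \tag{42} $$

Moreover, for $n \ge 2$ we have

$$ \mathrm{sdp}_{2n}[A,\mathcal{F}] \le \widetilde{\mathrm{sdp}}_n[A,\mathcal{F}] . \tag{43} $$

Proof. Let $\Omega^{2n}$ be a feasible solution for $\mathrm{sdp}_{2n}[A,\mathcal{F}]$ with the even level constraints as in (33). For any
$w \in \Sigma_{2n}$, let $w_z$ and $w_y$ be the subwords of $w$ containing only symbols of type $z$ and $y$ respectively.
For example, if $w = z_1 y_1^2 y_2 z_2 y_1$, then $w_z = z_1 z_2$ and $w_y = y_1^2 y_2 y_1$.

We define for every $w \in \Sigma_{2n}$ the complex number $m_w := \Omega^{2n}_{w_z,w_y}$ and let $\tilde\Omega^n_{u,v} = m_{u^*\circ v}$ for arbitrary
words $u, v$ of length at most $n$. Because of this form, it is easily seen that $\tilde\Omega^n_{u,v^*\circ w} = \tilde\Omega^n_{v\circ u,w}$. Moreover,
observe that if $w \sim w'$ then $w_z = w'_z$ as well as $w_y = w'_y$. It follows that $\tilde\Omega^n_{u,v} = \tilde\Omega^n_{u',v'}$ if $u \sim u'$ and
$v \sim v'$. For the positivity constraint we write

$$ \sum_{u,v\in\Sigma_n} \tilde\Omega^n_{u,v}\,|u\rangle\langle v| = \sum_{u,v\in\Sigma_n} \Omega^{2n}_{u_z^*\circ v_z,\,u_y^*\circ v_y}\,|u\rangle\langle v| . \tag{44} $$

This matrix is a principal sub-matrix of the matrix

$$ \sum_{s,t,u,v\in\Sigma_n} \Omega^{2n}_{s^*\circ t,\,u^*\circ v}\,|s\rangle\langle t|\otimes|u\rangle\langle v|, \tag{45} $$

by only considering rows corresponding to $t$ and $s$ being words with only symbols of type $z$, and $u$
and $v$ being words with only symbols of type $y$, and also such that $\ell(s\circ u), \ell(t\circ v) \le n$. As a result
$\tilde\Omega^n \succeq 0$. For the constraints $g \in \mathcal{G}$, we have

$$ \sum_i g^i \sum_{u,v\in\Sigma_{n-1}} \tilde\Omega^n_{u,(i)\circ v}\,|u\rangle\langle v| = \sum_i g^i \sum_{u,v\in\Sigma_{n-1}} \Omega^{2n}_{u_z^*\circ(i)\circ v_z,\,u_y^*\circ v_y}\,|u\rangle\langle v|, \tag{46} $$

which again is a positive semidefinite matrix as it is a principal sub-matrix of $\Omega^{2n}[g,1]$. The positivity
of $\tilde\Omega^n[k]$ for $k \in \mathcal{K}$ is similar. □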
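The subwords $w_z$ and $w_y$ used in the proof are easy to compute. The following Python sketch represents a word as a tuple of symbol indices, with indices $1,\dots,N$ standing for $z$-symbols and $N+1,\dots,N+M$ for $y$-symbols, and compares two words by their pair $(w_z, w_y)$. The function names are illustrative. Using the pair as a test for $w \sim w'$ in both directions is an assumption of this sketch: the text above only states that equivalent words have equal subwords, while the converse holds here because commuting every $z$ past every $y$ brings any word into the form $w_z \circ w_y$.

```python
def split_word(word, num_z):
    """Return (w_z, w_y) for a word given as a tuple of symbol indices.

    Symbols 1..num_z are of type z, larger indices are of type y.
    The relative order inside each subword is preserved.
    """
    w_z = tuple(s for s in word if s <= num_z)
    w_y = tuple(s for s in word if s > num_z)
    return w_z, w_y


def equivalent(u, v, num_z):
    """Test u ~ v under z/y commutation by comparing the split subwords."""
    return split_word(u, num_z) == split_word(v, num_z)
```

For example, with $N = 3$ the word $z_1 y_1^2 y_2 z_2 y_1$ is the tuple `(1, 4, 4, 5, 2, 4)`, and `split_word` returns `((1, 2), (4, 4, 5, 4))`. Words that differ in the order of two $z$-symbols, such as `(1, 2, 4)` and `(2, 1, 4)`, are not identified.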


This proposition implies in particular that the convergence of the new hierarchy $\mathrm{sdp}_n[A,\mathcal{F}]$ already
follows from the convergence of the NPA hierarchy $\widetilde{\mathrm{sdp}}_n[A,\mathcal{F}]$ (see Appendix A.2 for a direct proof).
We leave it as an open question if the comparison $\mathrm{sdp}_1[A,\mathcal{F}] \le \widetilde{\mathrm{sdp}}_1[A,\mathcal{F}]$ for the first level is special
or if we might even have $\mathrm{sdp}_n[A,\mathcal{F}] \le \widetilde{\mathrm{sdp}}_n[A,\mathcal{F}]$ in general. We emphasize that it is unfair to directly
compare the SDPs $\mathrm{sdp}_n[A,\mathcal{F}]$ and $\widetilde{\mathrm{sdp}}_n[A,\mathcal{F}]$ as our program can have more variables. In fact, if we
take into account the commutation relations in the NPA program (17), the variable $\tilde\Omega^n$ is effectively
smaller than the matrix $\Omega^n$ for our new relaxation (32), and even more so for large $n$.


3. Applications 

3.1. Two-Prover Games. In a two-prover game, each player (or prover) gets asked a question by
the referee: $q_1 \in Q_1$ for the first player and $q_2 \in Q_2$ for the second player. Each player is then asked to
provide an answer $a_1 \in A_1$ and $a_2 \in A_2$. The referee, looking at the questions and answers $q_1, q_2, a_1, a_2$,
decides whether the players win or lose the game according to a function $V : A_1\times A_2\times Q_1\times Q_2 \to \{0,1\}$.
The players may use any agreed upon protocol but they cannot communicate once they have received
the questions. The fundamental quantity of interest given such a game is the largest probability of
success that the players can achieve. The study of multi-prover games was introduced in [4] and has
played a major role in theoretical computer science [1]. It also provides a very nice interpretation for
understanding non-local correlations that can be obtained by measuring an entangled state [18]. The
value of a game defined by the verification predicate $V$ and a distribution $\pi$ on $Q_1\times Q_2$ is given by

$$
\begin{aligned}
\omega(V,\pi) := \underset{(e,d)}{\text{maximize}}\quad & \sum_{q_1,q_2} \pi(q_1,q_2) \sum_{a_1,a_2} V(a_1,a_2,q_1,q_2)\, e(a_1|q_1)\, d(a_2|q_2) \\
\text{subject to}\quad & \sum_{a_1} e(a_1|q_1) = 1 \quad \forall q_1 \in Q_1 \\
& \sum_{a_2} d(a_2|q_2) = 1 \quad \forall q_2 \in Q_2 \\
& 0 \le e(a_1|q_1) \le 1 \quad \forall (a_1,q_1) \in A_1\times Q_1 \\
& 0 \le d(a_2|q_2) \le 1 \quad \forall (a_2,q_2) \in A_2\times Q_2 .
\end{aligned}
\tag{47}
$$

In the notation of (3), we have $N = |Q_1||A_1|$, $M = |Q_2||A_2|$, $\alpha \in Q_1\times A_1$ and $\beta \in Q_2\times A_2$. The
matrix specifying the objective function is given by

$$ A_{(q_1,a_1),(q_2,a_2)} = \pi(q_1,q_2)\,V(a_1,a_2,q_1,q_2) . \tag{48} $$
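As a concrete instance of (47) and (48), the following Python sketch builds the objective matrix for the CHSH game and computes its classical value by enumerating deterministic strategies. The CHSH game (uniform binary questions, winning condition $a_1 \oplus a_2 = q_1 \wedge q_2$) is chosen here as a standard textbook example and is not taken from the text above. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

questions = [0, 1]
answers = [0, 1]

# Uniform distribution pi on pairs of questions.
pi = {(q1, q2): Fraction(1, 4) for q1 in questions for q2 in questions}


def V(a1, a2, q1, q2):
    """CHSH predicate: win iff a1 XOR a2 equals q1 AND q2."""
    return int((a1 ^ a2) == (q1 & q2))


# Objective matrix A_{(q1,a1),(q2,a2)} = pi(q1,q2) * V(a1,a2,q1,q2), as in (48).
A = {((q1, a1), (q2, a2)): pi[q1, q2] * V(a1, a2, q1, q2)
     for q1, a1, q2, a2 in product(questions, answers, questions, answers)}


def classical_value():
    """Maximum of (47) over deterministic strategies f: Q1 -> A1, g: Q2 -> A2."""
    best = Fraction(0)
    for f in product(answers, repeat=len(questions)):
        for g in product(answers, repeat=len(questions)):
            value = sum(A[(q1, f[q1]), (q2, g[q2])]
                        for q1 in questions for q2 in questions)
            best = max(best, value)
    return best
```

Here `classical_value()` returns `Fraction(3, 4)`, the well-known classical value of the CHSH game. Enumerating deterministic strategies suffices because the objective is bilinear in the two conditional distributions.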

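The constraint functions $\mathcal{F}$ are the positivity and normalization conditions. When the players are
allowed to share entanglement (of arbitrary dimension), then we define the entangled value of the
game as

$$
\begin{aligned}
\omega^*(V,\pi) := \text{maximize}\quad & \sum_{q_1,q_2} \pi(q_1,q_2) \sum_{a_1,a_2} V(a_1,a_2,q_1,q_2)\,\langle\psi|E(a_1|q_1)\,D(a_2|q_2)|\psi\rangle \\
\text{subject to}\quad & [E(a_1|q_1), D(a_2|q_2)] = 0 \quad \forall (a_1,a_2,q_1,q_2) \in A_1\times A_2\times Q_1\times Q_2 \\
& \sum_{a_1} E(a_1|q_1) = \mathrm{id}_{\mathcal{H}} \quad \forall q_1 \in Q_1 \\
& \sum_{a_2} D(a_2|q_2) = \mathrm{id}_{\mathcal{H}} \quad \forall q_2 \in Q_2 \\
& 0 \preceq E(a_1|q_1) \preceq \mathrm{id}_{\mathcal{H}} \quad \forall (a_1,q_1) \in A_1\times Q_1 \\
& 0 \preceq D(a_2|q_2) \preceq \mathrm{id}_{\mathcal{H}} \quad \forall (a_2,q_2) \in A_2\times Q_2 .
\end{aligned}
\tag{49}
$$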


Using the procedure described in Section 2, we can define a sequence of SDPs $\mathrm{sdp}_n(V,\pi)$ that are
upper bounds on $\omega^*(V,\pi)$. In particular, for $n = 1$, the SDP reads

$$
\begin{aligned}
\mathrm{sdp}_1(V,\pi) := \text{maximize}\quad & \sum_{q_1,q_2} \pi(q_1,q_2) \sum_{a_1,a_2} V(a_1,a_2,q_1,q_2)\,\Omega^1_{(q_1,a_1),(q_2,a_2)} \\
\text{subject to}\quad & \Omega^1 \in \mathrm{Pos}\big(1 + |Q_1||A_1| + |Q_2||A_2|\big) \\
& \Omega^1_{\emptyset,\emptyset} = 1 \\
& \sum_{a_1} \Omega^1_{(q_1,a_1),u} = \Omega^1_{\emptyset,u} \quad \forall q_1 \in Q_1,\ u \in \Sigma_1 \\
& \sum_{a_2} \Omega^1_{(q_2,a_2),u} = \Omega^1_{\emptyset,u} \quad \forall q_2 \in Q_2,\ u \in \Sigma_1 \\
& \Omega^1_{u,v} \ge 0 \quad \forall u,v \in \Sigma_1 .
\end{aligned}
\tag{50}
$$

We have that the boundedness condition from Assumption 2.1 is fulfilled by the last two constraints
in (49). Compared to the first level of the NPA hierarchy, the additional constraint is the last one,
namely that all the matrix entries are non-negative. Note that for the special case of two-prover games
the NPA hierarchy would explicitly encode the fact that we can assume that the operators $E(a_1|q_1)$
and $D(a_2|q_2)$ define projective measurements [42]. This is done by adding some relations in the algebra
$\Sigma_\infty$: one would add the relation,³

$$ (q_i,a_i)\circ(q_i,a_i') = \delta_{a_i = a_i'}\,(q_i,a_i) \quad\text{for } i \in \{1,2\}, \tag{51} $$

and this decreases the number of words to be considered. Using this property together with the
second level of the NPA hierarchy, one could then add to the first level of NPA the constraint that
³ We could easily add this property as well, but we choose not to do it to simplify the exposition.
the off-diagonal blocks of the matrix only have non-negative elements:

$$ \tilde\Omega^1_{(q_1,a_1),(q_2,a_2)} \ge 0 \quad\text{for } (q_1,a_1) \in Q_1\times A_1 \text{ and } (q_2,a_2) \in Q_2\times A_2 . \tag{52} $$

The SDP with these non-negativity constraints for the off-diagonal blocks also appeared in the
context of studying unique games in [35] (see also [33] for a discussion of various SDP relaxations). The
additional constraint in our SDP is that all the entries of the matrix are required to be non-negative.


Independent work: Very recently and independently of our work, the preprint [53] appeared
showing (among other things) that in the case of games, the first level of the NPA hierarchy can be
strengthened by including the constraint that the matrix elements are non-negative. This
strengthening corresponds to $\mathrm{sdp}_1(V,\pi)$ as in (50).


3.2. Noisy Channel Coding. Let us recall the setup of channel coding from the introduction. We
have a channel mapping an element from the set $X$ to an element of the set $Y$ according to probabilities
given by $W_{X\to Y}(y|x)$. The objective is to determine the maximum success probability for transmitting
$k$ bits of information using this channel. The classical version of the problem is described in (1). In
the notation of (3), we have $N = 2^k|X|$, $M = 2^k|Y|$, $\alpha \in [2^k] \times X$ and $\beta \in [2^k] \times Y$. The matrix
specifying the objective function is given by

\[
A_{(i,x),(j,y)} = \frac{1}{2^k}\,\delta_{i=j}\, W_{X\to Y}(y|x) .
\tag{53}
\]


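The constraint functions $\mathcal{F}$ are the positivity and normalization conditions. Explicitly writing the
first level SDP from (37) with some easy simplifications, we get

\[
\begin{aligned}
S^{\mathrm{sdp}_1}(W,k) := \text{maximize} \quad & \frac{1}{2^k} \sum_{x,y,i} W_{X\to Y}(y|x)\, \Omega^1_{(i,x),(i,y)} \\
\text{subject to} \quad & \Omega^1 \in \mathrm{Pos}(1 + 2^k|X| + 2^k|Y|) \\
& \Omega^1_{\emptyset,\emptyset} = 1 \\
& \sum_{x} \Omega^1_{w,(i,x)} = \Omega^1_{w,\emptyset} \quad \forall i \in [2^k],\ w \in \mathcal{S}_1 \\
& \sum_{i} \Omega^1_{w,(i,y)} = \Omega^1_{w,\emptyset} \quad \forall y \in Y,\ w \in \mathcal{S}_1 \\
& \Omega^1_{u,v} \geq 0 \quad \forall u,v \in \mathcal{S}_1 .
\end{aligned}
\tag{54}
\]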
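As a sanity check on the constraints of (54), the following Python sketch builds the rank-one matrix $\Omega^1 = vv^T$ from a deterministic classical encoder/decoder pair and verifies that it is feasible, which illustrates why the first level SDP is a relaxation of the classical program. The channel is the one from (56); the index labels, variable names and the particular strategy are illustrative assumptions and not part of the original text.

```python
from fractions import Fraction

T = Fraction(1, 3)
# Channel from (56): Z[y][x] = probability of output y given input x.
Z = [[T, T, 0, 0], [0, 0, T, T], [T, 0, T, 0], [0, T, 0, T], [T, 0, 0, T], [0, T, T, 0]]
K_BITS = 1
MSGS, XS, YS = range(2 ** K_BITS), range(4), range(6)

# Words of length at most one: the empty word, encoder letters (i, x), decoder letters (i, y).
EMPTY = ("empty",)
ENC = [("enc", i, x) for i in MSGS for x in XS]
DEC = [("dec", i, y) for i in MSGS for y in YS]
WORDS = [EMPTY] + ENC + DEC

# Hypothetical deterministic strategy: message i is sent as input x = i, and each
# output y is decoded to the message that most likely produced it.
encoder = {i: i for i in MSGS}
decoder = {y: max(MSGS, key=lambda i: Z[y][encoder[i]]) for y in YS}

# Rank-one moment matrix Omega = v v^T; it is positive semidefinite by construction.
v = {EMPTY: 1}
v.update({w: int(encoder[w[1]] == w[2]) for w in ENC})
v.update({w: int(decoder[w[2]] == w[1]) for w in DEC})
omega = {(a, b): v[a] * v[b] for a in WORDS for b in WORDS}

# Normalization and entrywise non-negativity constraints of the first level SDP.
feasible = (
    omega[EMPTY, EMPTY] == 1
    and all(sum(omega[w, ("enc", i, x)] for x in XS) == omega[w, EMPTY]
            for i in MSGS for w in WORDS)
    and all(sum(omega[w, ("dec", i, y)] for i in MSGS) == omega[w, EMPTY]
            for y in YS for w in WORDS)
    and all(entry >= 0 for entry in omega.values())
)

# Objective of (54) evaluated on this feasible point.
value = Fraction(1, 2 ** K_BITS) * sum(
    Z[y][x] * omega[("enc", i, x), ("dec", i, y)] for x in XS for y in YS for i in MSGS
)
print(feasible, value)
```

The objective value of this feasible point equals the success probability of the chosen classical strategy, so the SDP value can only be larger than the classical optimum.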

Again, the additional constraint compared to the NPA hierarchy is the last one, namely the fact that
all the entries of $\Omega^1$ are non-negative. Using this condition, we see that we have the desirable property
that for any valid channel $W$ and any $k$,

\[
S^{\mathrm{sdp}_1}(W,k) \leq \frac{1}{2^k} \sum_{x,y,i} W_{X\to Y}(y|x)\, \Omega^1_{(i,x),\emptyset}
= \frac{1}{2^k} \sum_{i,x} \Omega^1_{(i,x),\emptyset}
= \frac{1}{2^k} \sum_{i,x} \Omega^1_{\emptyset,(i,x)}
= \frac{1}{2^k} \sum_{i} 1 = 1 ,
\tag{55}
\]

where we have used that the matrix $\Omega^1$ is hermitian (which is implied by $\Omega^1 \succeq 0$). Now, as a concrete
example for which the classical and the quantum success probabilities are different we mention the
following setup from [49]. The objective is to send $k = 1$ bit over the noisy channel $Z_{X\to Y}(y|x)$
represented by the input-output matrix


\[
\begin{pmatrix}
1/3 & 1/3 & 0 & 0 \\
0 & 0 & 1/3 & 1/3 \\
1/3 & 0 & 1/3 & 0 \\
0 & 1/3 & 0 & 1/3 \\
1/3 & 0 & 0 & 1/3 \\
0 & 1/3 & 1/3 & 0
\end{pmatrix}
\tag{56}
\]


It is shown in [49] that for this channel the classical and quantum success probabilities as in (1) and (2),
respectively, are separated as

\[
S^*(Z,1) \geq \frac{2 + 2^{-1/2}}{3} \approx 0.902 > 0.833 \approx \frac{5}{6} = S(Z,1) .
\tag{57}
\]

Moreover, it was shown in [31, 58] that the above lower bound for $S^*(Z,1)$ is optimal as long as we
restrict the optimization in (2) to two dimensional Hilbert spaces.
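The classical value $5/6$ in (57) can be reproduced by brute force. The sketch below assumes that the columns of (56) are the inputs $x$ and the rows are the outputs $y$ (each column then sums to one), enumerates all deterministic encoders for $k = 1$, and uses the maximum-likelihood decoder for each of them. Function and variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

T = Fraction(1, 3)
# Channel from (56): Z[y][x] = probability of output y given input x.
Z = [[T, T, 0, 0], [0, 0, T, T], [T, 0, T, 0], [0, T, 0, T], [T, 0, 0, T], [0, T, T, 0]]
NUM_INPUTS, NUM_OUTPUTS, NUM_MESSAGES = 4, 6, 2  # k = 1 bit, i.e. two messages


def classical_success(channel):
    """Best success probability over deterministic encoders and decoders."""
    best = Fraction(0)
    # The objective is bilinear in encoder and decoder, so deterministic maps suffice.
    for encoder in product(range(NUM_INPUTS), repeat=NUM_MESSAGES):
        # For a fixed encoder, the best decoder maps each output y to the message
        # whose codeword makes y most likely.
        total = sum(max(channel[y][x] for x in encoder) for y in range(NUM_OUTPUTS))
        best = max(best, Fraction(total, NUM_MESSAGES))
    return best


classical_value = classical_success(Z)
quantum_lower_bound = (2 + 2 ** -0.5) / 3  # lower bound on S*(Z, 1) quoted in (57)
print(classical_value, round(quantum_lower_bound, 3))
```

Any two distinct inputs of this channel share exactly one output, so two codewords cover five of the six outputs with weight $1/3$ each, which is where $5/6$ comes from.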
Implementing our first level SDP relaxation (54) using CVX for MATLAB [28, 27] gives the first
non-trivial upper bound for the general optimization (2) leading to,^

\[
S^{\mathrm{sdp}_1}(Z,1) \approx 0.908 \geq S^*(Z,1) \geq 0.902 .
\tag{58}
\]

We note that the first level NPA relaxation as in (38) only gives the trivial upper bound of one. This
is the case even when adding the constraint that the off-diagonal elements of the matrix $\Omega^1$ are
non-negative.^ In Appendix D, we show that the bound given by $S^{\mathrm{sdp}_1}(Z,1)$ is in fact achievable
with four dimensional entanglement-assistance:

\[
S^*(Z,1) \geq \frac{1}{2} + \frac{1}{\sqrt{6}} \approx 0.908 .
\tag{59}
\]


Subsequent work: After this work was posted, a limit on the maximum advantage that can be
obtained by using entanglement-assistance was proved in [2]. More precisely, we have that for any
channel $W$ and sending $k$ bits of information,

\[
S(W,k) \geq \left(1 - e^{-1}\right) S^*(W,k) .
\tag{60}
\]

3.3. Randomness Extractors. A randomness extractor is defined by a set of functions

\[
\mathrm{Ext} := \{ f_s : [2^n] \to [2^m] \}_{s \in [2^d]}
\tag{61}
\]

mapping bit strings of length $n$ to shorter ones of length $m$; see [57] for a survey. As the name suggests,
the goal is to extract (almost) perfect randomness from a weaker source of randomness. That is, given
some distribution over bit strings of length $n$, by applying one of the functions chosen uniformly at
random, we want to obtain a distribution close to the uniform one (in the total variation distance). The
requirement is that the initial distribution contains enough randomness as measured using the
min-entropy, which is equal to minus the logarithm of the maximal entry of the probability distribution. In
order for this procedure to work for all sources satisfying the min-entropy constraint, it can be shown
that the minimal size of the seed $d$ is logarithmic in $n$ [57]. Since the total variation distance between
two distributions can itself be written as an optimization over test functions, the performance of a
given extractor Ext can be cast as a bilinear optimization program. The objective function in the
general program (3) is chosen to be indexed by elements $i \in [2^n]$ and pairs $(s,j) \in [2^{d+m}]$,

\[
A_{i,(s,j)} := \frac{1}{2^d}\,\delta_{f_s(i)=j} - \frac{1}{2^{d+m}} .
\tag{62}
\]

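The constraints are the positivity and normalization of the input distribution $z_i$, as well as the
min-entropy requirement, and the restriction to test functions as given by positive numbers $y_{(s,j)}$. We
arrive at

\[
\begin{aligned}
\mathrm{Err}(\mathrm{Ext},k) := \text{maximize} \quad & \frac{1}{2^d} \sum_{i,s,j} \left( \delta_{f_s(i)=j} - \frac{1}{2^m} \right) z_i\, y_{(s,j)} \\
\text{subject to} \quad & 0 \leq z_i \leq 2^{-k} \quad \forall i \in [2^n] \\
& \sum_i z_i = 1 \\
& 0 \leq y_{(s,j)} \leq 1 \quad \forall (s,j) \in [2^{d+m}] .
\end{aligned}
\tag{63}
\]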
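For a fixed input distribution $z$, the maximization over test functions $y$ in (63) can be carried out explicitly, and the result is the seed-averaged total variation distance to the uniform distribution. The sketch below checks this on a toy example with exact rational arithmetic. The extractor `f`, the parameter sizes and the distribution `z` are hypothetical choices made only for illustration.

```python
from fractions import Fraction
from itertools import product

N_BITS, D_BITS, M_BITS = 3, 1, 1  # toy sizes: n = 3 input bits, d = 1 seed bit, m = 1 output bit
INPUTS, SEEDS, OUTPUTS = range(2 ** N_BITS), range(2 ** D_BITS), range(2 ** M_BITS)


def f(s, i):
    """Hypothetical toy extractor: parity of the bits of i selected by a seed-dependent mask."""
    mask = 0b011 if s == 0 else 0b110
    return bin(i & mask).count("1") % 2


def objective(z, y):
    """Bilinear objective of (63) for a distribution z and a test function y."""
    total = Fraction(0)
    for i, s, j in product(INPUTS, SEEDS, OUTPUTS):
        coeff = int(f(s, i) == j) - Fraction(1, 2 ** M_BITS)
        total += coeff * z[i] * y[(s, j)]
    return total / 2 ** D_BITS


def best_test_function_value(z):
    """Maximum of (63) over y for fixed z; a linear function on a box is maximal at a 0/1 vertex."""
    keys = list(product(SEEDS, OUTPUTS))
    return max(objective(z, dict(zip(keys, bits))) for bits in product((0, 1), repeat=len(keys)))


def seed_averaged_tv(z):
    """Average over seeds of the total variation distance between f_s(z) and uniform."""
    total = Fraction(0)
    for s, j in product(SEEDS, OUTPUTS):
        p = sum(z[i] for i in INPUTS if f(s, i) == j)
        total += abs(p - Fraction(1, 2 ** M_BITS))
    return total / 2 / 2 ** D_BITS


# Flat distribution on four inputs: min-entropy k = 2, so every z_i is at most 2^{-k} = 1/4.
z = [Fraction(1, 4) if i in (0, 1, 2, 7) else Fraction(0) for i in INPUTS]
print(best_test_function_value(z), seed_averaged_tv(z))
```

The factor $1/2$ in the total variation distance appears because, for every seed, the coefficients multiplying $y_{(s,j)}$ sum to zero over $j$, so their positive part is half of their absolute sum.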


^The code is available at http://www.omarfawzi.info.

^Another upper bound, the so-called non-signaling success probability of the channel (56), is one as well (see [49, 31, 58]
for details).
Here, the parameter $k$ measures the amount of initial min-entropy. As discussed before, the constraints
on the positive numbers $y_{(s,j)}$ just ensure that it is a test function, and hence the program becomes

\[
\begin{aligned}
\mathrm{Err}(\mathrm{Ext},k) = \underset{z_i}{\text{maximize}} \quad & \frac{1}{2} \cdot \frac{1}{2^d} \sum_{(s,j)} \left| \sum_i \left( \delta_{f_s(i)=j} - \frac{1}{2^m} \right) z_i \right| \\
\text{subject to} \quad & 0 \leq z_i \leq 2^{-k} \quad \forall i \in [2^n] \\
& \sum_i z_i = 1 ,
\end{aligned}
\tag{64}
\]

the total variation distance of the output distribution to the uniform distribution on $m$ bits. The
average over the choice of the seed value $s$ outside of the absolute value ensures that the closeness to
the uniform distribution holds even conditioned on the seed. We also call


\[
C(\mathrm{Ext},k) := \mathrm{Err}(\mathrm{Ext},k)
\tag{65}
\]

the classical value of Ext. We can now apply our general quantization procedure to (63). Assuming
for simplicity that the underlying Hilbert space is finite-dimensional and repeating the steps (22)-(24),
we arrive at the program (for the general case see again Appendix A.1),


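\[
\begin{aligned}
\mathrm{Err}^*(\mathrm{Ext},k) := \text{maximize} \quad & \frac{1}{2^d} \sum_{i,s,j} \left( \delta_{f_s(i)=j} - \frac{1}{2^m} \right) \mathrm{Tr}\!\left[ \sigma^{1/2} E_i\, \sigma^{1/2} D_{(s,j)} \right] \\
\text{subject to} \quad & \sigma \succeq 0,\ \mathrm{Tr}[\sigma] = 1 \\
& 0 \preceq E_i \preceq 2^{-k}\, 1 \quad \forall i \in [2^n] \\
& \sum_i E_i = 1 \\
& 0 \preceq D_{(s,j)} \preceq 1 \quad \forall (s,j) \in [2^{d+m}] .
\end{aligned}
\tag{66}
\]

Setting $\sigma_i := \sigma^{1/2} E_i\, \sigma^{1/2}$ and again by the duality of the 1-norm to the $\infty$-norm we can rewrite the
program as

\[
\begin{aligned}
\mathrm{Err}^*(\mathrm{Ext},k) = \underset{\sigma_i}{\text{maximize}} \quad & \frac{1}{2} \cdot \frac{1}{2^d} \sum_{(s,j)} \left\| \sum_i \left( \delta_{f_s(i)=j} - \frac{1}{2^m} \right) \sigma_i \right\|_1 \\
\text{subject to} \quad & 0 \preceq \sigma_i \preceq 2^{-k} \sum_i \sigma_i \quad \forall i \in [2^n] \\
& \sum_i \mathrm{Tr}[\sigma_i] = 1 .
\end{aligned}
\tag{67}
\]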


From this we define the normalized classical-quantum state

\[
\omega := \sum_i |i\rangle\langle i| \otimes \sigma_i \quad \text{satisfying} \quad \omega \preceq 2^{-k} \cdot 1 \otimes \sum_i \sigma_i ,
\tag{68}
\]

and hence the objective function in (67) corresponds to the total variation distance of the output
to a quantum state that is of the form uniform distribution on $m$ bits tensor the reduced state on
the quantum system. This means that an adversary cannot tell the output apart from the uniform
distribution even when having access to the quantum system as well as the value of the seed. Here,
the inequality condition in (68) defines the worst case quantum conditional min-entropy that is, e.g.,
discussed in [56, Appendix B]. However, in the literature the average case quantum conditional
min-entropy is more commonly used (as discussed in [50]). This gives rise to the following so-called quantum
value of Ext,


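\[
\begin{aligned}
Q(\mathrm{Ext},k) := \underset{(\sigma_i,\omega)}{\text{maximize}} \quad & \frac{1}{2} \cdot \frac{1}{2^d} \sum_{(s,j)} \left\| \sum_i \left( \delta_{f_s(i)=j} - \frac{1}{2^m} \right) \sigma_i \right\|_1 \\
\text{subject to} \quad & \omega \succeq 0,\ \mathrm{Tr}[\omega] = 1 \\
& 0 \preceq \sigma_i \preceq 2^{-k}\, \omega \quad \forall i \in [2^n] \\
& \sum_i \mathrm{Tr}[\sigma_i] = 1 .
\end{aligned}
\tag{69}
\]

However, it follows from the equivalence of the worst case and average case quantum conditional
min-entropy [56, Lemma 20] that there cannot be a large gap between $\mathrm{Err}^*$ and $Q$.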

Proposition 3.1. For $\varepsilon > 0$ we have

(70) (5(Ext, k) < Err* (Ext, k — log (l/e^ + l)) + ^ • 

We conclude that $\mathrm{Err}^*(\mathrm{Ext},k)$ captures to what extent Ext is a quantum-proof extractor. Hence,
this property can be tested by our SDP hierarchy (32). We give the full first level $\mathrm{Err}^{\mathrm{sdp}_1}(\mathrm{Ext},k)$
as in (37) in Appendix E. For our purposes, however, it will be sufficient to work with the following 
simplified upper bound Err^'^^i (Ext, fc) > Err^'^Pi (Ext, fc) that ignores some of the constraints: 


1 


Err^dPi (Ext, fe) := maximize —r 


2d 




A . 

fs(i)—j 2m 


ni 


(71) 


subject to e Pos(l + 2'^ + 2 ^+™-]) 

> 0 yw,w'e'Si 

i 

Vi e [2”], Vw e E, 

fih 4 V(s.i) £ [ 2 "+”] , v«, £ El. 

where again some of the positivity constraints on the matrix elements are new as compared to the 
NPA hierarchy. We emphasize that these conditions are important to obtain the following bounds on 
the gap between Err(Ext, k) and Err®dPi(Ext, k), which then also give an upper estimate for the error 
of the quantum-proof extractor (66). 

Theorem 3.2. We have that 


(72) Err'dPi (Ext, k) < \/2v^\/Err(Ext, fc), 
as well as 

(73) Err^i (Ext, k) < 6 Kq 2”“^ Err(Ext, A: - 1) 
where $K_G$ denotes Grothendieck’s constant.

The proof is based on ideas from [9, Theorem 5] and we present it in full detail in Appendix E. 
We remark that compared to the relaxation in [9, Theorem 4], the SDP relaxation (71) has some 
new and different constraints. The additional constraints are introduced by the sub-matrices where 
one variable is equal to the empty word $\emptyset$. Using these additional constraints we have the desirable
property that the first level SDP relaxation Err®dPi(Ext, k) is always bounded by one,® 


(74) Err"dPi(Ext,A:) < 


1 

¥ 


E 


1 - 


1 


n 


< 


1 


(*)i(«d) — 2d 


Eb 


1 


^( i ).0 - 1 “ 2 


1 


h{s,jy.fs{i)=j 

This implies that the argument in [9, Theorem 8] showing a large gap between the SDP value and the 
quantum value does not apply for Err®dPi(Ext, k). We leave it as an open question whether there can 
be a large gap between Err^^Pi (Ext, k) or Err^dPi (Ext, A:) and Err*(Ext, A;). 


®Both Err* (Ext, k) and Q(Ext, k) are always bounded by one whereas the relaxation in [9, Theorem 4] can get arbitrarily 
large in general. 

^Some more results on how to extend the argument in [9, Theorem 8] to other SDP relaxations can be found in [23]. 
Finally, we point out that using ideas similar to the ones presented in this section, one can also
construct a hierarchy for more general objects called quantum-proof randomness condensers [57, 8]. 
It would be interesting to explore in more detail the applications of these relaxations to condensers. 


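3.4. Optimization over the cone $\mathcal{CS}_+$. Here we show that one can use the hierarchy introduced
in Section 2 to give an SDP hierarchy of outer approximations for the cone $\mathcal{CS}_+^N$ defined in [40],

\[
\mathcal{CS}_+^N := \left\{ K \in \mathrm{Pos}(N) : K_{\alpha\beta} = \mathrm{Tr}[X_\alpha X_\beta] \text{ with } X_1,\dots,X_N \in \mathrm{Pos}(d) \text{ for some } d \in \mathbb{N} \right\} .
\tag{75}
\]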

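A typical program considered by Burgdorf, Laurent and Piovesan [13] now reads as follows:

\[
\begin{aligned}
p^{\mathcal{CS}_+}[A,\{F^i\}_i] := \text{maximize} \quad & \sum_{\alpha,\beta} A_{\alpha,\beta}\, K_{\alpha,\beta} \\
\text{subject to} \quad & K \in \mathcal{CS}_+^N \\
& \sum_{\alpha,\beta} F^i_{\alpha,\beta}\, K_{\alpha,\beta} = G^i \quad \forall i .
\end{aligned}
\tag{76}
\]

Here again, $A_{\alpha,\beta}$ are real numbers specifying the objective function, and the real numbers $F^i_{\alpha,\beta}$ and $G^i$
specify additional equality constraints. Specific instances include the quantum versions of stability
and chromatic numbers for graphs; see e.g., [16, 13]. Note that as we do not distinguish between two
types of variables here, we use $N$ instead of $N + M$ for the number of variables. As in Assumption 2.1,
we assume that the constraints $\sum_{\alpha,\beta} F^i_{\alpha,\beta} K_{\alpha,\beta} = G^i$ are such that they imply $X_\alpha \preceq C\, 1_d$ for some
constant $C$. For all applications we know of, this is satisfied.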

The above optimization problem is closely related to the tracial moment problem and to tracial
optimization of non-commutative polynomials as studied extensively by Burgdorf, Cafuta, Klep, and
Povh [14, 15, 37]. In particular, Klep and Povh [37] studied the optimization problem of minimizing
the trace of a polynomial in non-commutative variables under further positivity constraints and
derived a convergent SDP hierarchy. In what follows, we describe how our general approach can be used
to derive a new hierarchy especially suited for quadratic polynomials and thus for optimization over
$\mathcal{CS}_+^N$.

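Following the procedure given in Section 2, the $n$-th level SDP relaxation is given by optimizing
over a positive semidefinite matrix $\Omega^n$ whose rows and columns are indexed by words of length up
to $n$ on the alphabet $\{1,\dots,N\}$. These words span a complex linear subspace of $\mathcal{S}_\infty$ which we
denote by $\mathcal{S}_n$. The entries corresponding to words of length 1 are the candidate entries for
$K_{\alpha,\beta}$ in the program (76). The fact that $K \in \mathcal{CS}_+^N$ allows us to add additional constraints as described
in (31). When $n$ is odd and writing

\[
\delta := \frac{N^{n+1} - 1}{N - 1} \quad \text{and} \quad \delta' := \left( \frac{N^{(n-1)/2+1} - 1}{N - 1} \right)^2
\tag{77}
\]

we find

\[
\begin{aligned}
\mathrm{sdp}_n[A,\{F^i\}_i] := \text{maximize} \quad & \sum_{\alpha,\beta} A_{\alpha,\beta}\, \Omega^n_{(\alpha),(\beta)} \\
\text{subject to} \quad & \Omega^n \in \mathrm{Pos}(\delta) \\
& \Omega^n_{\emptyset,\emptyset} = 1 \\
& \sum_{r,s,u,v} \Omega^n_{r^* \circ (\alpha) \circ s,\, u^* \circ (\beta) \circ v}\, |r\rangle\langle s| \otimes |u\rangle\langle v| \in \mathrm{Pos}(\delta') \quad \forall \alpha,\beta \in [N] \\
& \Omega^n_{(\alpha) \circ u,\, v \circ (\beta)} = \Omega^n_{u \circ (\beta),\, (\alpha) \circ v} \quad \forall \alpha,\beta \in [N],\ u,v \in \mathcal{S}_{n-1} \\
& \sum_{\alpha,\beta} F^i_{\alpha,\beta}\, \Omega^n_{(\alpha),(\beta)} = G^i \quad \forall i .
\end{aligned}
\tag{78}
\]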
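The dimensions in (77) are simply counts of words: $\delta$ is the number of words of length at most $n$ over $N$ letters, and $\delta'$ is the square of the number of words of length at most $(n-1)/2$. The following sketch confirms both formulas by explicit enumeration; the function names and the sample values $N = 5$, $n = 3$ are illustrative choices (the closed forms require $N \geq 2$).

```python
from itertools import product


def words_up_to(length, alphabet_size):
    """All words (as tuples) of length 0..length over the alphabet {1, ..., alphabet_size}."""
    letters = range(1, alphabet_size + 1)
    return [w for l in range(length + 1) for w in product(letters, repeat=l)]


def delta(n, alphabet_size):
    """Closed form from (77) for the size of the level-n moment matrix."""
    return (alphabet_size ** (n + 1) - 1) // (alphabet_size - 1)


def delta_prime(n, alphabet_size):
    """Closed form from (77) for the size of the localizing blocks (n odd)."""
    half = (n - 1) // 2
    return ((alphabet_size ** (half + 1) - 1) // (alphabet_size - 1)) ** 2


N, n = 5, 3
num_words = len(words_up_to(n, N))
num_half_words = len(words_up_to((n - 1) // 2, N))
print(num_words, num_half_words ** 2)
```

The count grows geometrically in $n$, which is why only low levels of the hierarchy are practical for numerical experiments.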


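Recall that $\circ$ denotes the concatenation of words and $(\alpha)$ refers to a word of length 1 with the
symbol $\alpha$. Note that $n = 1$ corresponds to optimizing over the doubly non-negative cone. The way
we constructed $\mathrm{sdp}_n[A,\{F^i\}_i]$ as a relaxation of $p^{\mathcal{CS}_+}[A,\{F^i\}_i]$ is similar to what we did in previous
sections. Let $K \in \mathcal{CS}_+^N$, then there exist positive semidefinite matrices $X'_1,\dots,X'_N \in \mathrm{Pos}(d)$ such
that $K_{\alpha\beta} = \mathrm{Tr}[X'_\alpha X'_\beta]$. First, let us write $X_\alpha = \sqrt{d}\, X'_\alpha$ and, for any word $u \in \mathcal{S}_n$, $X_u$ as the product
of the matrices corresponding to its symbols: $X_u = X_{u_1} \cdots X_{u_{|u|}}$ with $X_\emptyset = 1$. Recalling that $u^*$ is the
word $u$ inverted, we define

\[
\Omega^n_{u,v} := \mathrm{Tr}\!\left[ \frac{1}{d}\, X_{u^*} X_v \right] .
\tag{79}
\]

First, as $\Omega^n$ is the (scaled) Gram matrix of the family $\{X_u : u \in \mathcal{S}_n\}$, it is positive semidefinite. Also
$\Omega^n_{\emptyset,\emptyset} = \mathrm{Tr}[1/d] = 1$. Moreover, for a vector $|\phi\rangle = \sum_{r,u} c_{r,u} |r\rangle|u\rangle$, we have

\begin{align}
\langle\phi| \Big( \sum_{r,s,u,v} \Omega^n_{r^* \circ (\alpha) \circ s,\, u^* \circ (\beta) \circ v}\, |r\rangle\langle s| \otimes |u\rangle\langle v| \Big) |\phi\rangle
&= \frac{1}{d} \sum_{r,s,u,v} \bar{c}_{r,u}\, c_{s,v}\, \mathrm{Tr}\!\left[ X_{s^*} X_\alpha X_r X_{u^*} X_\beta X_v \right] \tag{80} \\
&= \frac{1}{d}\, \mathrm{Tr}\!\left[ \Big( \sum_{s,v} c_{s,v} X_v X_{s^*} \Big) X_\alpha \Big( \sum_{r,u} \bar{c}_{r,u} X_r X_{u^*} \Big) X_\beta \right] \tag{81} \\
&= \frac{1}{d}\, \mathrm{Tr}\!\left[ \Big( \sum_{s,v} c_{s,v} X_v X_{s^*} \Big) X_\alpha \Big( \sum_{s,v} c_{s,v} X_v X_{s^*} \Big)^{*} X_\beta \right] \geq 0 . \tag{82}
\end{align}

The constraint

\[
\Omega^n_{(\alpha) \circ u,\, v \circ (\beta)} = \Omega^n_{u \circ (\beta),\, (\alpha) \circ v}
\tag{83}
\]

corresponds to the cyclicity of the trace,

\[
d \cdot \Omega^n_{(\alpha) \circ u,\, v \circ (\beta)} = \mathrm{Tr}\!\left[ X_{u^*} X_\alpha X_v X_\beta \right] = \mathrm{Tr}\!\left[ X_\beta X_{u^*} X_\alpha X_v \right] = d \cdot \Omega^n_{u \circ (\beta),\, (\alpha) \circ v} .
\tag{84}
\]

Note that such a constraint did not appear in our other examples as we were optimizing over the state
involved in defining $\Omega^n$. In this example, we want to fix the state to be maximally mixed, $1/d$, and
this is reflected in the cyclicity condition. We can also define the SDPs for even $n$ similarly as in (33).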
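The construction (79) and the cyclicity constraint (83) can be checked on a small example with exact arithmetic. The sketch below uses three hypothetical $2 \times 2$ positive semidefinite integer matrices in the role of $X_1, X_2, X_3$ (they are illustrative choices, not taken from the text), builds the entries $\Omega_{u,v} = \mathrm{Tr}[X_{u^*} X_v]/d$, and verifies normalization, symmetry and cyclicity for level $n = 3$.

```python
from fractions import Fraction
from itertools import product

D = 2
# Hypothetical positive semidefinite 2x2 matrices X_1, X_2, X_3 (illustration only).
X = {
    1: [[2, 1], [1, 1]],
    2: [[1, 0], [0, 3]],
    3: [[1, 1], [1, 1]],
}
IDENTITY = [[1, 0], [0, 1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(D)) for j in range(D)] for i in range(D)]


def word_matrix(word):
    """X_u = X_{u_1} ... X_{u_p}, with the empty word mapped to the identity."""
    out = IDENTITY
    for letter in word:
        out = matmul(out, X[letter])
    return out


def omega(u, v):
    """Entry (79): Tr[X_{u*} X_v] / d, where u* is the reversed word."""
    m = matmul(word_matrix(u[::-1]), word_matrix(v))
    return Fraction(sum(m[i][i] for i in range(D)), D)


def words_up_to(length):
    return [w for l in range(length + 1) for w in product(X, repeat=l)]


LEVEL = 3
all_words = words_up_to(LEVEL)
short_words = words_up_to(LEVEL - 1)

# Cyclicity constraint (83) for all letters a, b and all words u, v of length < LEVEL.
cyclicity_holds = all(
    omega((a,) + u, v + (b,)) == omega(u + (b,), (a,) + v)
    for a in X for b in X for u in short_words for v in short_words
)
symmetric = all(omega(u, v) == omega(v, u) for u in all_words for v in all_words)
print(cyclicity_holds, symmetric, omega((), ()))
```

Cyclicity holds here for any choice of matrices because it only uses $\mathrm{Tr}[AB] = \mathrm{Tr}[BA]$; positivity of the $X_\alpha$ is what the block constraint in (78) additionally encodes.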

We implemented the SDP relaxations to test whether a given matrix $K$ is in $\mathcal{CS}_+$. In [24, 40] it
was shown that the matrix

\[
K := \begin{pmatrix}
4 & 0 & 2 & 2 & 0 \\
0 & 4 & 0 & 2 & 2 \\
2 & 0 & 4 & 0 & 3 \\
2 & 2 & 0 & 4 & 0 \\
0 & 2 & 3 & 0 & 4
\end{pmatrix}
\tag{85}
\]

is not in the closure of $\mathcal{CS}_+$. Using CVX for MATLAB [28, 27], we were able to numerically certify
using level $n = 3$ of the hierarchy that the matrix is indeed not in the cone $\mathcal{CS}_+$.^
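Since the level $n = 1$ corresponds to the doubly non-negative cone, it is instructive to check that the matrix (85) is itself doubly non-negative, so that a higher level is genuinely needed to exclude it. The sketch below does this exactly in integer arithmetic by testing all principal minors (a symmetric matrix is positive semidefinite if and only if all of its principal minors are non-negative). The helper names are illustrative.

```python
from itertools import combinations

K = [
    [4, 0, 2, 2, 0],
    [0, 4, 0, 2, 2],
    [2, 0, 4, 0, 3],
    [2, 2, 0, 4, 0],
    [0, 2, 3, 0, 4],
]


def det(m):
    """Exact integer determinant by cofactor expansion along the first row."""
    if not m:
        return 1
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def principal_minors(m):
    """Yield the determinants of all principal submatrices of m."""
    n = len(m)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            yield det([[m[i][j] for j in idx] for i in idx])


is_symmetric = all(K[i][j] == K[j][i] for i in range(5) for j in range(5))
is_entrywise_nonnegative = all(entry >= 0 for row in K for entry in row)
is_psd = all(minor >= 0 for minor in principal_minors(K))
is_doubly_nonnegative = is_symmetric and is_entrywise_nonnegative and is_psd
print(is_doubly_nonnegative, det(K))
```

The determinant of $K$ vanishes, so $K$ lies on the boundary of the positive semidefinite cone, and an entrywise check plus a semidefiniteness check alone cannot separate it from $\mathcal{CS}_+$.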

The convergence proof of Theorem 2.3 covers the above case as well, which then raises the question
how the limiting point

\[
p^*[A,\{F^i\}_i] := \lim_{n\to\infty} \mathrm{sdp}_n[A,\{F^i\}_i]
\tag{86}
\]

of the programs (78) can be represented. Not surprisingly, we cannot assert that it corresponds to an
element in the cone $\mathcal{CS}_+$, which asks for an underlying finite-dimensional Hilbert space. However, as
shown in Appendix B, the assumption that Connes’ embedding conjecture [44, 20] has a positive
answer implies that the value $p^*[A,\{F^i\}_i]$ agrees with the program (76), or more precisely, with its
value if optimized over the closure $\overline{\mathcal{CS}}_+$ of the cone $\mathcal{CS}_+$.^


Corollary 3.3. For any $n \geq 1$ we have

\[
p^{\mathcal{CS}_+}[A,\{F^i\}_i] \leq \mathrm{sdp}_n[A,\{F^i\}_i] .
\tag{87}
\]

Moreover, provided the Connes embedding conjecture has a positive answer [20, 44], we have

\[
p^{\mathcal{CS}_+}[A,\{F^i\}_i] = p^*[A,\{F^i\}_i] .
\tag{88}
\]


^The code is available at http://www.omarfawzi.info.

^We write maximize in (76), which is consistent with the statement that the maximum is not attained (cf. Footnote 1).
Clearly, the limiting point of our SDP hierarchy then corresponds to the supremum, and hence to the optimization over
the closure of the cone $\mathcal{CS}_+$.
In order to prove (88), we could either relate to Klep and Povh’s result [37] or make use of
the fact that our hierarchy converges to the same value as the NPA hierarchy with added cyclicity
constraints. Both approaches would imply that the state $\tau$ on $\mathcal{S}_\infty$ constructed in the convergence proof
is a tracial state, that is $\tau(ab) = \tau(ba)$. However, if Connes’ embedding conjecture holds, then this
state can be represented as a tracial state on the ultrapower of the hyperfinite factor. Finally, Burgdorf,
Laurent and Piovesan [13] have shown that this implies the stated result. For the convenience of the
reader, we present such an argument in Appendix B.2.


Appendix A. Missing Proofs for General Hilbert Spaces 


A.1. Upper Bounds on the Quantum Value. Here we show that even if we allow for general
Hilbert spaces in the quantum program (7), the SDP hierarchy (32)-(33) is still a relaxation thereof
(in Section 2.4 we have only shown this for finite-dimensional spaces). For this we start from the
quantum program (7) and upper bound it in a more algebraic form.

Given any feasible solution $(\psi, E_\alpha, D_\beta)$ of the quantum program (7) we consider the algebra
generated by the operators

\[
D_1,\dots,D_M,
\tag{89}
\]

acting on the Hilbert space $\mathcal{H}$, and denote its closure in operator norm by $\mathcal{D}$. This is then a C*-algebra
and we denote the set of hermitian functionals on $\mathcal{D}$ by $\mathcal{D}^*_h$ and the set of positive functionals by
$\mathcal{D}^*_+$. Now the normalized vector $\psi \in \mathcal{H}$ induces a normalized positive functional $\sigma \in \mathcal{D}^*_+$ via

\[
\mathcal{D} \ni D \mapsto \sigma(D) := \langle\psi| D \psi\rangle .
\tag{90}
\]

Moreover, the hermitian operators $E_\alpha$ induce positive functionals $\rho_\alpha \in \mathcal{D}^*_+$,

\[
\mathcal{D} \ni D \mapsto \rho_\alpha(D) := \langle\psi| E_\alpha D \psi\rangle = \langle\psi| D E_\alpha \psi\rangle ,
\tag{91}
\]


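where the last equality follows from the commutativity constraint $[E_\alpha, D_\beta] = 0$. In order to find an
upper bound on the quantum value $p^*[A,\mathcal{F}]$, we consider the following optimization program over all
C*-algebras $\mathcal{D}$,

\[
\begin{aligned}
p[A,\mathcal{F}] := \underset{(\mathcal{D},\rho_\alpha,D_\beta)}{\text{maximize}} \quad & \sum_{\alpha,\beta} A_{\alpha,\beta}\, \rho_\alpha(D_\beta) \\
\text{subject to} \quad & \rho_\alpha \in \mathcal{D}^*_h,\ \sigma \in \mathcal{D}^*_+ \text{ with } \sigma(1) = 1 \\
& g(\rho_1,\dots,\rho_N,\sigma) \succeq 0 \quad \forall g \in \mathcal{G} \\
& k(D_1,\dots,D_M) \succeq 0 \quad \forall k \in \mathcal{K} ,
\end{aligned}
\tag{92}
\]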


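where the constraints $g \in \mathcal{G}$ are now understood as

\[
g(\rho_1,\dots,\rho_N,\sigma) := g^0 \sigma + \sum_\alpha g^\alpha \rho_\alpha ,
\tag{93}
\]

and positivity is read in the algebraic sense. Note that the boundedness constraints (cf.
Assumption 2.1) translate to

\[
\forall \alpha \in [N] : C\sigma \succeq \rho_\alpha \succeq -C\sigma \quad \text{and} \quad \forall \beta \in [M] : C1 \succeq D_\beta \succeq -C1 .
\tag{94}
\]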

Now we show that the SDP hierarchy (32)-(33) is an upper bound on the algebraic program (92), and
with that also on the quantum program (7).


Proposition A.1. For any $n \in \mathbb{N}$ we have that,^

\[
\mathrm{sdp}_n[A,\mathcal{F}] \geq p[A,\mathcal{F}] \geq p^*[A,\mathcal{F}] .
\tag{95}
\]

^We will see in Appendix A.2 that even $p[A,\mathcal{F}] = p^*[A,\mathcal{F}]$.

Proof. The second inequality follows from the discussion above and we now prove the first inequality.

Let $\rho_\alpha, \sigma$ be the set of functionals associated to the optimal solution $p[A,\mathcal{F}]$. A standard GNS
construction for the state $\sigma$ gives rise to a Hilbert space $\mathcal{H}$, a dense mapping $i : \mathcal{D} \to \mathcal{H}$, a vector
$\xi = i(1)$ and a representation $\pi : \mathcal{D} \to B(\mathcal{H})$ defined by $\pi(x) i(a) = i(xa)$, such that

\[
\langle i(a) | i(b) \rangle = \sigma(a^* b) \quad \text{and} \quad \sigma(x) = \langle \xi | \pi(x) \xi \rangle .
\tag{96}
\]

For the sake of convenience, we identify $D_\beta$ with $\pi(D_\beta)$. By the von Neumann commutant theorem,
the double commutant $\pi(\mathcal{D})''$ of $\pi(\mathcal{D})$ is a von Neumann algebra, denoted by $\mathcal{M}$, and the vector
$\xi$ defines a normal state on $\mathcal{M}$. Now, by [59, Theorem 2.2], there exists an anti-unitary operator
$J : \mathcal{H} \to \mathcal{H}$, satisfying $J^2 = 1$, another vector $\psi \in \mathcal{H}$ (differing from $\xi$ by at most a phase), such that
for all $Y \in \mathcal{M}$

\[
\langle \xi | Y \xi \rangle = \langle \psi | Y \psi \rangle, \quad J\psi = \psi, \quad \text{and} \quad \langle \psi | Y J Y \psi \rangle \geq 0 .
\tag{97}
\]

Moreover, we have that $J \mathcal{M} J = \mathcal{M}'$, meaning that for any operator $X$ in the commutant of $\mathcal{M}$ there
exists an element $Y \in \mathcal{M}$ such that $JYJ = X$. By the non-commutative Radon-Nikodym derivative
argument, see, e.g., [55], setting

\[
h_\alpha : \mathcal{H} \to \mathcal{H}, \quad \langle i(a) | h_\alpha i(b) \rangle = \rho_\alpha(a^* b)
\tag{98}
\]

defines an operator which is positive and bounded, since

\[
0 \leq \langle i(a) | h_\alpha i(a) \rangle = \rho_\alpha(a^* a) \leq C \sigma(a^* a) = C \langle i(a) | i(a) \rangle .
\tag{99}
\]

A standard calculation also gives that $h_\alpha \in \mathcal{M}'$. Moreover, for any linear constraint
$g(\rho_1,\dots,\rho_N,\sigma) \succeq 0$ we have

\begin{align}
\langle i(a) | g(h_1,\dots,h_N,1)\, i(a) \rangle &= g\big( \langle i(a) | h_1 i(a) \rangle, \dots, \langle i(a) | h_N i(a) \rangle, \langle i(a) | i(a) \rangle \big) \tag{100} \\
&= g(\rho_1,\dots,\rho_N,\sigma)(a^* a) \geq 0 \tag{101}
\end{align}

and hence $g(h_1,\dots,h_N,1)$ defines a positive operator. By the previous assertions, we have that
$E_\alpha = J h_\alpha J$ is an element of $\mathcal{M}$ and likewise $g(E_1,\dots,E_N,1) \succeq 0$.

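We have all necessary ingredients at hand to define the analogue of the bilinear form
$\omega : \mathcal{S}_\infty \times \mathcal{S}_\infty \to \mathbb{C}$ from (25). First, let us abbreviate for $\gamma \in \{1,\dots,N+M\}$

\[
Z_\gamma := \begin{cases}
E_\gamma & \gamma \in \{1,\dots,N\} \\
D_{\gamma-N} & \gamma \in \{N+1,\dots,N+M\}
\end{cases}
\tag{102}
\]

and for any word $u = (u_1,\dots,u_p)$, $u_i \in \{1,\dots,N+M\}$,

\[
Z_u := Z_{u_1} \cdots Z_{u_p} .
\tag{103}
\]

For any two words $u, v$ we set

\[
\omega(u^*, v) := \langle \psi | Z_v J Z_u \psi \rangle ,
\tag{104}
\]

and this also defines the matrix $\Omega$ as in (26). This matrix is positive semidefinite by

\[
\sum_{u,v} \bar{\lambda}_u \lambda_v\, \omega(u^*, v) = \sum_{u,v} \bar{\lambda}_u \lambda_v \langle \psi | Z_v J Z_u \psi \rangle
= \Big\langle \psi \Big| \Big( \sum_v \lambda_v Z_v \Big) J \Big( \sum_u \lambda_u Z_u \Big) \psi \Big\rangle \geq 0 ,
\tag{105}
\]

where it is essential that $J$ is an anti-unitary operator. Moreover, property (27) is checked by

\begin{align}
\sum_{s,u,t,v} \bar{c}_{su}\, c_{tv}\, \Omega_{s^*t,\, u^*v}
&= \sum_{s,u,t,v} \bar{c}_{su}\, c_{tv}\, \omega\big( (s^*t)^*, u^*v \big) \notag \\
&= \sum_{s,u,t,v} \bar{c}_{su}\, c_{tv} \langle \psi | Z_u^* Z_v J Z_s^* Z_t \psi \rangle
= \sum_{s,u,t,v} \bar{c}_{su}\, c_{tv} \langle Z_u \psi | J J Z_v J Z_s^* Z_t \psi \rangle \notag \\
&= \sum_{s,u,t,v} \bar{c}_{su}\, c_{tv} \langle Z_u \psi | J Z_s^* Z_t J Z_v \psi \rangle
= \sum_{s,u,t,v} \bar{c}_{su}\, c_{tv} \langle Z_t J Z_v \psi | Z_s J Z_u \psi \rangle \geq 0 , \tag{106}
\end{align}

which defines a positive matrix. The linear constrained assertion (31) follows in a similar way. From
the previous definitions, we have that

\[
\rho_\alpha(D_\beta) = \langle \psi | h_\alpha D_\beta \psi \rangle = \langle \psi | J E_\alpha J D_\beta \psi \rangle = \langle \psi | D_\beta J E_\alpha \psi \rangle = \Omega_{(\alpha),(\beta)} ,
\tag{107}
\]

and hence the (infinite-dimensional) matrix $\Omega$ fulfills the constraints given by any finite level $n$ as
in (32). □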
A.2. Asymptotic Convergence. Here we show that the hierarchy (32)-(33) asymptotically
converges to the quantum value (7). The argument follows previous works [22, 47, 42].

Theorem A.2. Let $\mathrm{sdp}_n[A,\mathcal{F}]$ denote the SDP hierarchy (32)-(33) of the quantum bilinear
program (7), and assume that Assumption 2.1 holds. Then, we have the following:

(1) In the limit of $n \to \infty$ the optimal values of the programs $\mathrm{sdp}_n[A,\mathcal{F}]$ converge to a finite
value,

\[
\lim_{n\to\infty} \mathrm{sdp}_n[A,\mathcal{F}] = p[A,\mathcal{F}] .
\tag{108}
\]

(2) There exists a Hilbert space $\mathcal{H}$, a normalized vector $\xi \in \mathcal{H}$, a *-homomorphism $\pi : \mathcal{S}_\infty \to B(\mathcal{H})$
as well as a linear and positive mapping $\varphi : \mathcal{S}_\infty \to B(\mathcal{H})$ with commuting ranges (that is,
$[\varphi(a), \pi(b)] = 0$ for all $a, b \in \mathcal{S}_\infty$) as well as elements $z_\alpha, y_\beta \in \mathcal{S}_\infty$ such that

\[
p[A,\mathcal{F}] = \sum_{\alpha,\beta} A_{\alpha,\beta} \langle \xi | \varphi(z_\alpha) \pi(y_\beta) \xi \rangle .
\tag{109}
\]

Moreover, the constraints given by the linear functions $g \in \mathcal{G}$ and $k \in \mathcal{K}$ are all satisfied,

\[
g(\varphi(z_1),\dots,\varphi(z_N)) \succeq 0, \quad \text{as well as} \quad k(\pi(y_1),\dots,\pi(y_M)) \succeq 0 .
\tag{110}
\]

Since the quantum bilinear program (7) is a maximization over all expressions as on the
right-hand side of (109) under the constraints (110), it immediately follows that

\[
p^*[A,\mathcal{F}] \geq p[A,\mathcal{F}] .
\tag{111}
\]

Now because the inequality in the other direction was already established in (95), we conclude that the
hierarchy (32)-(33) asymptotically converges to the quantum value (7),

\[
p^*[A,\mathcal{F}] = \lim_{n\to\infty} \mathrm{sdp}_n[A,\mathcal{F}] .
\tag{112}
\]

Furthermore, the optimal value $p[A,\mathcal{F}]$ of the algebraic optimization (92) also becomes equal to the
quantum value

\[
p^*[A,\mathcal{F}] = p[A,\mathcal{F}] ,
\tag{113}
\]

again by (95).

Proof of Theorem A.2. We first note that due to Assumption 2.1, the positivity constraints provide a
bound on the diagonal elements of the $d(1) \times d(1)$ sub-matrix of $\Omega^n$

(114) 

and thus on its trace. Hence, we find that 


(115) 


0 < 


a.,/3 


<\\A\\d{l)C\ 


Moreover, we have $\mathrm{sdp}_n[A,\mathcal{F}] \geq \mathrm{sdp}_m[A,\mathcal{F}]$ for $n < m$. Thus, the sequence $\mathrm{sdp}_n[A,\mathcal{F}]$ is
monotonically decreasing and lower bounded by zero, hence converging to a finite value $p[A,\mathcal{F}]$.

In order to proceed, we need another expression for the limiting point $p[A,\mathcal{F}]$. More precisely,
we have to examine in which way the limiting point can be seen as being specified by an
infinite-dimensional matrix, capturing the constraints on all words of all possible lengths at once. For any
$n \in \mathbb{N}$ we have the subspace $\mathcal{S}_n = \{ a \in \mathcal{S}_\infty \mid a = \sum_{w : l(w) \leq n} a_w w \}$. Furthermore, for $n$ odd we define
the two families of cones


(116) sym(S„) := <| x G (g) | x = ^ Xtof (g) a*, o* G , A* > 0 


(117) {Tin <g> A]„)_|_ < 


X G 


(g) I X = ^ a(fak (g) blfbi , a^, b^ G T^n-i)/2 


fjex 

kl 
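Let $\Omega^n$ be a feasible point of the $n$-th level of the SDP hierarchy (32). By mapping a pair of words
$u, v \in \mathcal{S}_n$ to the corresponding entry of $\Omega^n$ we specify a linear functional $\omega$ on $\mathcal{S}_n \otimes \mathcal{S}_n$, and it is
easily seen that the constraints on $\Omega^n$ imply that

\[
\omega\big( \mathrm{sym}(\mathcal{S}_n) \cup (\mathcal{S}_n \otimes \mathcal{S}_n)_+ \big) \geq 0, \quad \omega(1) = 1 .
\tag{118}
\]

For the value

\[
p'[A,\mathcal{F}] := \inf \Big\{ q : \exists n \text{ with } q 1 - \sum_{\alpha,\beta} A_{\alpha,\beta}\, z_\alpha \otimes y_\beta \in \mathrm{sym}(\mathcal{S}_n) \cup (\mathcal{S}_n \otimes \mathcal{S}_n)_+ \Big\} ,
\tag{119}
\]

we find that for any $\varepsilon > 0$ there exists an $n \in \mathbb{N}$ such that

\[
p'[A,\mathcal{F}] + \varepsilon \geq \sum_{\alpha,\beta} A_{\alpha,\beta}\, \Omega^n_{(\alpha),(\beta)} .
\tag{120}
\]

Hence, we have $p'[A,\mathcal{F}] \geq p[A,\mathcal{F}]$. But by exploiting the Positivstellensatz of Helton and
McCullough [30], a duality argument shows (see, e.g., [22]),

\[
p'[A,\mathcal{F}] = \sup \Big\{ \Big| \sum_{\alpha,\beta} A_{\alpha,\beta}\, \omega(z_\alpha \otimes y_\beta) \Big| : \omega\big( \mathrm{sym}(\mathcal{S}_\infty) \cup (\mathcal{S}_\infty \otimes \mathcal{S}_\infty)_+ \big) \geq 0,\ \omega(1) = 1 \Big\} ,
\tag{121}
\]

which then implies $p'[A,\mathcal{F}] = p[A,\mathcal{F}]$. In the following, we show how to construct a Hilbert space and
associated representations, starting from $\omega$.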

As usual, the argument is based on a GNS construction, and closely follows the ideas of Woronowicz
in his study of purifications for states on C*-algebras [60]. We first turn the free algebra $\mathcal{S}_\infty$ into
a C*-algebra, that is a norm-closed algebra such that we have $\|x^* x\| = \|x\|^2$. This is achieved by
defining for $x \in \mathcal{S}_\infty$

\[
\|x\| = \sup \big\{ \|\pi(x)\|_{B(\mathcal{H}_\pi)} : \pi : \mathcal{S}_\infty \to B(\mathcal{H}_\pi) \text{ a *-representation} \big\} .
\tag{122}
\]

Here, a *-representation is an algebraic homomorphism of $\mathcal{S}_\infty$ into the bounded operators on some
Hilbert space $\mathcal{H}$ such that the *-involution is mapped to the usual involution. It is easily checked
that this norm satisfies our requirement, and thus the topological closure of $\mathcal{S}_\infty$ under this norm is a
C*-algebra, which we denote by $\mathcal{A}$. For all $x_i \in \mathcal{S}_\infty$, Assumption 2.1 implies

\[
\exists C : C1 \succeq x_i^* x_i ,
\tag{123}
\]

which ensures that $\|x_i\| \leq \sqrt{C}$ and hence $x_i \in \mathcal{A}$, since by definition of positivity in $\mathcal{A}$ there exists $w_i \in \mathcal{A}$
with $x_i^* x_i + w_i^* w_i = C1$ and we have for any $\pi : \mathcal{A} \to B(\mathcal{H})$ and any $\psi \in \mathcal{H}$, $\|\psi\| = 1$

\[
\langle \psi | \pi(x_i^* x_i) \psi \rangle \leq \langle \pi(x_i)\psi | \pi(x_i)\psi \rangle + \langle \pi(w_i)\psi | \pi(w_i)\psi \rangle = C .
\tag{124}
\]

We also define the opposite C*-algebra $\bar{\mathcal{A}}$, which is as a topological space equal to $\mathcal{A}$ equipped with the
multiplication rule $a \cdot b = ba$ for $a, b \in \mathcal{A}$. Following [59], we denote $a^*$ as seen as an element of $\bar{\mathcal{A}}$ by
$\bar{a}$. Then the mapping $a \mapsto \bar{a}$ is a *-invariant, anti-linear multiplicative isometry from $\mathcal{A}$ to $\bar{\mathcal{A}}$.

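Let $\bar{\mathcal{A}} \otimes \mathcal{A}$ be the maximal C*-tensor product of $\bar{\mathcal{A}}$ and $\mathcal{A}$, see for example [48] for a precise
definition. On this algebra, we can define another *-invariant, anti-linear and multiplicative mapping
$j : \bar{\mathcal{A}} \otimes \mathcal{A} \to \bar{\mathcal{A}} \otimes \mathcal{A}$ satisfying $j^2 = \mathrm{id}$ by setting

\[
j(\bar{a} \otimes b) = \bar{b} \otimes a .
\tag{125}
\]

We define a state $s$ on $\bar{\mathcal{A}} \otimes \mathcal{A}$ by setting

\[
s(\bar{a} \otimes b) = \omega(a^*, b) ,
\tag{126}
\]

for words of finite length $a, b$ and then extending to the closure. Normalization is immediate and
positivity follows from the property $\omega((\mathcal{S}_n \otimes \mathcal{S}_n)_+) \geq 0$,

\[
s\big( (\bar{a} \otimes b)^* (\bar{a} \otimes b) \big) = s\big( \overline{a^* a} \otimes b^* b \big) = \omega(a^* a, b^* b) ,
\tag{127}
\]

since $\bar{x} \cdot \bar{y} = \overline{xy}$.

Carrying out the standard GNS construction for the state $s$ gives rise to a Hilbert space $\mathcal{H}$, a dense
mapping $i : \bar{\mathcal{A}} \otimes \mathcal{A} \to \mathcal{H}$, a vector $\xi = i(1)$ and a representation $\pi : \bar{\mathcal{A}} \otimes \mathcal{A} \to B(\mathcal{H})$ defined by
$\pi(\bar{a} \otimes b)\, i(\bar{c} \otimes d) = i(\overline{ac} \otimes bd)$, such that

\begin{align}
\langle i(\bar{a} \otimes b) | i(\bar{c} \otimes d) \rangle &= \omega(c^* a, b^* d) \tag{128} \\
s(\bar{a} \otimes b) &= \langle \xi | \pi(\bar{a} \otimes b) \xi \rangle . \tag{129}
\end{align}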
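We now define an anti-linear operator $J$ by defining it on the dense domain $i(\bar{\mathcal{A}} \otimes \mathcal{A})$ as

\[
\langle i(\bar{a} \otimes b) | J i(\bar{c} \otimes d) \rangle = \langle i(\bar{a} \otimes b) | i(\bar{d} \otimes c) \rangle .
\tag{130}
\]

Its adjoint equals itself, since $\omega(a^*, b) = \omega(b^*, a)$ due to positivity and

\[
\langle J i(\bar{a} \otimes b) | i(\bar{c} \otimes d) \rangle = \langle i(\bar{b} \otimes a) | i(\bar{c} \otimes d) \rangle = \omega(c^* b, a^* d) = \omega(d^* a, b^* c) = \langle i(\bar{a} \otimes b) | J i(\bar{c} \otimes d) \rangle .
\tag{131}
\]

Moreover, we find that $J^2 = 1$ and hence $J$ can be extended to an anti-unitary involution on $\mathcal{H}$.
Furthermore, we have

\[
J \pi(\bar{a} \otimes b) J\, i(\bar{c} \otimes d) = i(\overline{bc} \otimes ad) = \pi(\bar{b} \otimes a)\, i(\bar{c} \otimes d) = \pi(j(\bar{a} \otimes b))\, i(\bar{c} \otimes d)
\tag{132}
\]

and hence $J \pi(\bar{a} \otimes b) J = \pi(j(\bar{a} \otimes b))$. A similar calculation gives

\[
\pi(\bar{a} \otimes b) = J \pi(j(\bar{a} \otimes 1)) J\, \pi(1 \otimes b) = \pi(1 \otimes b)\, J \pi(j(\bar{a} \otimes 1)) J .
\tag{133}
\]

Hence the image of the linear mapping

\[
\varphi : a \mapsto \pi(\overline{a^*} \otimes 1) = J \pi(j(\overline{a^*} \otimes 1)) J
\tag{134}
\]

is contained in the commutant of $\pi(1 \otimes \mathcal{A})$. Moreover, any positive element $a^* a \in \mathcal{A}$ gets mapped to

\[
\varphi(a^* a) = \pi(\overline{a^* a} \otimes 1) = \pi(\bar{a} \otimes 1)^* \pi(\bar{a} \otimes 1) ,
\tag{135}
\]

which is a positive operator. This proves (109). The last assertion (110) follows similarly. Considering
a linear constraint $k(y_1,\dots,y_M) \in \mathcal{A}$, we find evaluating the diagonal matrix elements of
$\pi(1 \otimes k(y_1,\dots,y_M))$ that

\begin{align}
\langle i(\bar{a} \otimes b) | \pi(1 \otimes k(y_1,\dots,y_M))\, i(\bar{a} \otimes b) \rangle &= \langle i(\bar{a} \otimes b) | i(\bar{a} \otimes k(y_1,\dots,y_M) b) \rangle \tag{136} \\
&= \omega(a^* a, b^* k(y_1,\dots,y_M) b) \geq 0 . \tag{137}
\end{align}

Hence $\pi(1 \otimes k(y_1,\dots,y_M))$ is a positive operator. A similar derivation can be carried out for the map
$\varphi$. □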


Appendix B. Implications of Connes’ embedding conjecture

In this appendix, we discuss the implications of a positive answer to Connes’ embedding
conjecture [44, 20] for our hierarchy. We first give a short sketch of an argument why a positive answer
to Connes’ embedding conjecture implies that the optimization in the program (7) can be restricted
to finite-dimensional Hilbert spaces, though it does not imply that this supremum is also achieved.
In the second part of this appendix, we sketch the argument for the case of the completely positive
semidefinite cone $\mathcal{CS}_+$. As we do not want to go into the details about Connes’ embedding conjecture,
its different forms and its far reaching consequences (independent of the actual answer), we refer the
interested reader to the extensive reviews of Ozawa on the topic [43, 44].

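B.1. General case. In Theorem A.2, we found that the limiting point of our SDP hierarchy can be
expressed as

\[
p[A,\mathcal{F}] = \sum_{\alpha,\beta} A_{\alpha,\beta} \langle \xi | \pi^{\mathrm{op}}(z_\alpha) \pi(y_\beta) \xi \rangle ,
\tag{138}
\]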

where $\pi$ is a representation of the universal enveloping algebra $\mathcal{A}$ of $\mathcal{S}_\infty$, and $\pi^{\mathrm{op}}$ is a representation
of the opposite algebra $\bar{\mathcal{A}}$. Let $\mathcal{M}$ be the von Neumann algebra generated by $\pi(\mathcal{A})$. Since we assume
that Connes’ embedding conjecture holds, all von Neumann algebras satisfy Kirchberg’s QWEP
property [36], which implies that $\mathcal{M} = B/J$, where the C*-algebra $B$ has the WEP property, and $J$ is a
two-sided ideal in $B$. Since the $y_\beta$ are assumed to be hermitian elements, the Cayley transform $U_\beta$ of $\pi(y_\beta)$
is a unitary operator. Let $\pi : C^*[\mathbb{F}_M] \to \mathcal{M}$ be the *-homomorphism defined by $s_\beta \mapsto U_\beta$, where $s_\beta$
are the generators of the free group of $M$ elements ($C^*[\mathbb{F}_M]$ is the corresponding universal free group
algebra). We apply the same procedure to get another *-homomorphism $\pi^{\mathrm{op}} : C^*[\mathbb{F}_M]^{\mathrm{op}} \to \pi^{\mathrm{op}}(\mathcal{A})$.
Now, $C^*[\mathbb{F}_M]$ as a free group algebra satisfies the Lifting property [43], and thus the mapping

\[
\pi^{\mathrm{op}} \otimes \pi : C^*[\mathbb{F}_M]^{\mathrm{op}} \otimes C^*[\mathbb{F}_M] \to \pi^{\mathrm{op}}(\mathcal{A})\, \pi(\mathcal{A})
\tag{139}
\]

is continuous with respect to the minimal tensor product, see [36, Proposition 1.3 (iv)].
Correspondingly, the state $\omega$ defined by the vector $\xi$ extends to a state $\hat{\omega}$ on the minimal tensor product. As
in the proof of [44, Theorem 28], we can assume that the induced representation of $C^*[\mathbb{F}_M]^{\mathrm{op}}$ indeed
reduces to the opposite representation of $\pi(C^*[\mathbb{F}_M])$ on $\mathcal{H}$. Now we know that $C^*[\mathbb{F}_M]^{\mathrm{op}} \otimes C^*[\mathbb{F}_M]$
acts on S 2 {H) = T-L as (s (8) s)(x) = sxs. Since the state ch can be approximated by a normal
state [34], by inverting the Cayley transform we find that for any e > 0 there exists Hilbert-Schmidt 
operators Xi G S 2 {H) an hermitian elements Za, y/^ such that 


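(140) $\Big| \langle \zeta, \pi^{\mathrm{op}}(z_\alpha)\, \pi(y_\beta)\, \zeta \rangle - \sum_i \lambda_i\, \mathrm{Tr}\big[ x_i^* y_\beta x_i z_\alpha \big] \Big| < \varepsilon .$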


But since the state $\omega$ originates from a maximization, we can assume that only one term (say given by 
$x \in S_2(H)$) in the above sum is non-zero. It follows from $\langle \zeta | \zeta \rangle = 1$ that $\mathrm{Tr}[x^* x] = 1$ and 
hence we can, by an approximation argument, assume that $x$ is of finite rank, with support projection $p$. Projecting 
the hermitian elements as well, the form 


(141) $z_\alpha, y_\beta \mapsto \mathrm{Tr}\big[ x^* p\, y_\beta\, p\, x\, p\, z_\alpha\, p \big]$


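is seen to satisfy all the required constraints. In order to bring it into the form (22), we let $\sigma = |xp|^2$ 
and find with $x = u|x|$ the polar decomposition of $x$ that 
$\mathrm{Tr}[\sigma] = \mathrm{Tr}[|x| u^* u |x|] = \mathrm{Tr}[x^* x p] = 1$ as well as 

(142) $\mathrm{Tr}\big[ x^* p\, y_\beta\, p\, x\, p\, z_\alpha\, p \big] = \mathrm{Tr}\big[ \sigma^{1/2} y_\beta\, \sigma^{1/2}\, u z_\alpha u^* \big] .$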
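The argument above passes from hermitian elements to unitaries via the Cayley transform and later inverts it. The following is a minimal sketch of that correspondence for a 2x2 hermitian matrix, assuming the common convention $U = (y - i)(y + i)^{-1}$ (the text does not state which convention is used); the helper names are illustrative only.

```python
# Cayley transform of a 2x2 hermitian matrix, using plain Python complex numbers.
# ASSUMPTION: convention U = (y - i)(y + i)^{-1}; the paper does not fix one.

IDENTITY = [[1, 0], [0, 1]]

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_lin(a, b, cb):
    """Entrywise a + cb * b."""
    return [[a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def mat_inv(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def cayley(y):
    """U = (y - i)(y + i)^{-1}; y + i is invertible because y is hermitian."""
    return mat_mul(mat_lin(y, IDENTITY, -1j), mat_inv(mat_lin(y, IDENTITY, 1j)))

def inverse_cayley(u):
    """y = i (1 - U)^{-1} (1 + U); requires that 1 is not an eigenvalue of U."""
    m = mat_mul(mat_inv(mat_lin(IDENTITY, u, -1)), mat_lin(IDENTITY, u, 1))
    return [[1j * m[i][j] for j in range(2)] for i in range(2)]

y = [[1, 2 - 1j], [2 + 1j, -3]]   # hermitian example
u = cayley(y)                     # unitary
y_back = inverse_cayley(u)        # recovers y up to rounding
```

Here `mat_mul(u, dagger(u))` is the identity up to rounding, and `y_back` agrees with `y`, which is the finite-dimensional picture behind "inverting the Cayley transform".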


B.2. Completely positive semidefinite cone. Theorem A.2 also applies to this case, but we additionally get 
from the hierarchy that the constructed state fulfills the cyclicity constraint. More 
precisely, let $s$ be the state on $\mathcal{A} \otimes \mathcal{A}$ constructed in the proof of Theorem A.2. Note that in this 
setting, we do not distinguish two kinds of variables and hence $\mathcal{A}$ is the free C*-algebra generated by 
$N$ positive elements $z_\alpha$, for $\alpha \in \{1, \ldots, N\}$. The cyclicity constraints, which are added to each level, 
also hold for the state $s$, implying that we have 

(143) $s(z_\alpha \circ u \otimes v \circ z_\beta) = s(u \circ z_\beta \otimes z_\alpha \circ v),$

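where $u, v$ are arbitrary words in the variables $z_\alpha$. Applying this identity recursively to the choice 
$z_\beta = I$, we find for $u = z_{\alpha_1} z_{\alpha_2} \cdots z_{\alpha_n}$ 

(144) $s(u \otimes v) = s(z_{\alpha_1} z_{\alpha_2} \cdots z_{\alpha_n} \otimes v) = s(z_{\alpha_2} \cdots z_{\alpha_n} \otimes z_{\alpha_1} \circ v) = \ldots = s(I \otimes u^* \circ v) .$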

Moreover, by the same trick we find 

(145) $s(I \otimes u \circ z_\alpha) = s(z_\alpha \otimes u) = s(I \otimes z_\alpha \circ u) ,$

and hence $s(I \otimes u \circ v) = s(I \otimes v \circ u)$. These equalities can be linearly extended to hold for all finite 
polynomials $u, v \in \mathcal{A}$ in the variables $z_\alpha$, which form a dense subset. They are hence true for all 
$u, v \in \mathcal{A}$. Since $s$ is a state on $\mathcal{A} \otimes \mathcal{A}$, $s(I \otimes \,\cdot\,)$ is a state $\tau$ on 
$\mathcal{A}$, and the constraints just derived imply that it is a tracial state, $\tau(ab) = \tau(ba)$ for $a, b \in \mathcal{A}$. 
This is also the state which is constructed by the NPA hierarchy, if we would follow the proof steps mentioned in the main text. 
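The step from (145) to the trace property can be spelled out as follows. This is a sketch in the notation of (143)-(145); the word $v = z_{\beta_1} \cdots z_{\beta_m}$ is introduced here only for illustration. Each equality applies (145) once, with the word $u z_{\beta_1} \cdots z_{\beta_{j-1}}$ in place of $u$ and the letter $z_{\beta_j}$ in place of $z_\alpha$:

```latex
\begin{align*}
s(I \otimes u \circ z_{\beta_1} \cdots z_{\beta_m})
  &= s(I \otimes z_{\beta_m} \circ u \circ z_{\beta_1} \cdots z_{\beta_{m-1}}) \\
  &= s(I \otimes z_{\beta_{m-1}} z_{\beta_m} \circ u \circ z_{\beta_1} \cdots z_{\beta_{m-2}}) \\
  &= \ldots \\
  &= s(I \otimes z_{\beta_1} \cdots z_{\beta_m} \circ u) .
\end{align*}
```

Reading off the first and last line gives $s(I \otimes u \circ v) = s(I \otimes v \circ u)$ for all words $u, v$, which extends by linearity to polynomials.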

It follows from these considerations that the limiting point p* of our SDP (78) can be written as 

(146) $p^* = \sum_{\alpha,\beta} A_{\alpha,\beta}\, \tau(z_\alpha z_\beta) .$

Let $\pi_\tau$ be the GNS representation of the state $\tau$, and let $\pi_\tau(\mathcal{A})''$ be the finite von Neumann algebra 
generated by it. If Connes’ embedding conjecture holds, then $\pi_\tau(\mathcal{A})''$ embeds into an ultrapower of 
the hyperfinite factor, preserving the tracial character of the state. Let $\theta$ be this embedding. Then 
we have 

(147) $p^* = \sum_{\alpha,\beta} A_{\alpha,\beta}\, \tau \circ \theta^{-1}\big( \theta(z_\alpha)\, \theta(z_\beta) \big) ,$

and Burgdorf, Laurent and Piovesan [13] have shown that matrices of the form 
$\tau \circ \theta^{-1}\big( \theta(z_\alpha)\, \theta(z_\beta) \big)$ belong to the closure of the cone $\mathcal{CS}_+$. 
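For orientation, the finite-dimensional members of $\mathcal{CS}_+$ are, in the standard definition due to Laurent and Piovesan, the Gram-type matrices with entries $\mathrm{Tr}[Z_\alpha Z_\beta]$ for positive semidefinite matrices $Z_\alpha$ of some common size. The following is a small sketch of such a matrix; the function names and the three example matrices are illustrative assumptions and do not come from the text. An unnormalized trace is used, which only rescales the matrix and so does not affect cone membership.

```python
def psd_from_factor(b):
    """Return b b^T, which is positive semidefinite for any real matrix b."""
    rows, cols = len(b), len(b[0])
    return [[sum(b[i][k] * b[j][k] for k in range(cols)) for j in range(rows)]
            for i in range(rows)]

def trace_of_product(x, y):
    """Tr[x y] for square matrices given as nested lists."""
    n = len(x)
    return sum(x[i][k] * y[k][i] for i in range(n) for k in range(n))

def trace_gram_matrix(zs):
    """Matrix with entries Tr[Z_a Z_b]."""
    return [[trace_of_product(a, b) for b in zs] for a in zs]

# Three 2x2 positive semidefinite matrices built from arbitrary real factors.
zs = [psd_from_factor(b) for b in (
    [[1, 0], [0, 0]],
    [[1, 1], [0, 1]],
    [[0, 0], [2, -1]],
)]
gram = trace_gram_matrix(zs)
```

The resulting matrix is symmetric with non-negative entries, and it is positive semidefinite because it is the Gram matrix of the vectorized $Z_\alpha$.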
Appendix C. Generalizations concerning constraint sets and objective functions 


In the main text we only considered linear inequality constraints on the non-commutative variables 
(expressed by the constraint set). However, more general constraint sets can also be studied with our 
approach. 

In particular, equality constraints can already be included into the free algebra. For example, let 
$q$ be an irreducible polynomial with variables in $\Sigma_\infty$, such as $q(z) = z^2 - z$. The requirement that 
$q(z_i) = 0$, $q(y_j) = 0$, $i = 1, \ldots, N$, $j = 1, \ldots, M$, then corresponds to allowing only projection valued 
operators. If we denote by $(q)$ the ideal in $\Sigma_\infty$ generated by $q$, then we can form the quotient *-algebra 
$\Sigma_\infty/(q)$, which intuitively can be understood as starting with the free *-algebra $\Sigma_\infty$ and then imposing 
the constraint $q$. We can adapt our procedure for deriving the programs (32) to this new algebra, 
by defining the bilinear form (25) on the new algebra $\Sigma_\infty/(q)$ and then following the same procedure 
as before. However, since the simple monomials are no longer a basis for this quotient algebra, the 
derivation of levels now relies on first obtaining a monomial basis for $\Sigma_\infty/(q)$. This can be achieved 
if a finite Gröbner basis exists and is efficiently computable, as already explained in [45, Section 3.5]. 
Alternatively, the equality constraints can also be achieved by requiring that certain matrix elements 
of $\Omega$ are identified with each other. For example, for the constraint above we would have 


(148) $\Omega_{u,\, v \circ (i) \circ (i) \circ v} = \Omega_{u,\, v \circ (i) \circ v} ,$

for words $u, v \in \Sigma_\infty$ and $i = 1, \ldots, N + M$. 
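As a minimal sketch of the quotient construction for the projection constraint $q(z) = z^2 - z$: rewriting with the single rule $z_i z_i \to z_i$ sends every word to a unique normal form in which no letter is repeated consecutively, and these reduced words can serve as a monomial basis. The function names `reduce_word` and `normal_words`, and the encoding of the letters as integers, are illustrative assumptions and are not taken from the text or from [45].

```python
from itertools import product

def reduce_word(word):
    """Normal form of a word under the rule z_i z_i -> z_i.

    A word is a tuple of integer letter labels; adjacent repeats are collapsed.
    """
    out = []
    for letter in word:
        if not out or out[-1] != letter:
            out.append(letter)
    return tuple(out)

def normal_words(num_letters, max_len):
    """All distinct normal forms of words of length <= max_len.

    The empty tuple () stands for the empty word.
    """
    words = set()
    for length in range(max_len + 1):
        for w in product(range(num_letters), repeat=length):
            words.add(reduce_word(w))
    return sorted(words, key=lambda w: (len(w), w))

basis = normal_words(2, 2)   # two letters, words up to length 2
```

Matrix entries whose index words share the same normal form then label the same element of the quotient algebra and can be identified, in the spirit of (148).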

Apart from adding polynomial equality constraints, generalizations concerning the objective 
functions are also possible. Up to now, we only considered the case of bilinear terms. However, terms 
which are linear in just one variable or constant can be added if we allow the objective matrix $A$ 
to also have support on words involving the empty word $\emptyset$. That is, objective functions of the form 

(149) $\sum_{\alpha,\beta} A_{\alpha,\beta}\, \Omega_{\alpha,\beta} + \sum_{\alpha} a_\alpha\, \Omega_{\alpha,\emptyset} + \sum_{\beta} b_\beta\, \Omega_{\emptyset,\beta} + c\, \Omega_{\emptyset,\emptyset}$

fit into our framework. They correspond to optimizing a functional not only depending on the (quantum) 
correlations, but also on the marginal distributions. 


Appendix D. Entanglement-assisted noisy channel coding 

Here we show that for the channel $Z$ defined in Section 3.2, we have 

(150) $S^*(Z, 1) \geq \frac{1}{2} + \frac{1}{\sqrt{6}} \approx 0.908 .$

For that, we give a quantum protocol using a four dimensional maximally entangled state 

(151) $|\psi\rangle := \frac{1}{2} \sum_{i \in [4]} |i\rangle \otimes |i\rangle .$

For sending the bit 0, the sender performs a measurement in the computational basis $E(x|0) = |x\rangle\langle x|$ 
and for sending the bit 1, the sender performs a measurement in the rotated basis $E(x|1) = U|x\rangle\langle x|U^\dagger$ 
with 

(152) $U = \frac{1}{\sqrt{3}} \begin{pmatrix} \vdots & \vdots & \vdots & \vdots \\ -1 & -1 & 1 & 0 \end{pmatrix} .$

The possible outputs of the channel can be labeled by subsets of the inputs of size 2. We can write 
the success probability as 

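(153) $\frac{1}{6} \sum_{x \in [4],\, x' \neq x} \langle\psi|\, |x\rangle\langle x| \otimes D(0|\{x,x'\}) \,|\psi\rangle + \frac{1}{6} \sum_{x \in [4],\, x' \neq x} \langle\psi|\, U|x\rangle\langle x|U^\dagger \otimes \big( \mathrm{id} - D(0|\{x,x'\}) \big) \,|\psi\rangle$

(154) $= \frac{1}{2} + \frac{1}{6} \sum_{x \in [4],\, x' \neq x} \langle\psi|\, \big( |x\rangle\langle x| - U|x\rangle\langle x|U^\dagger \big) \otimes D(0|\{x,x'\}) \,|\psi\rangle$

(155) $= \frac{1}{2} + \frac{1}{6} \cdot 2 \sum_{\{x,x'\} \in \binom{[4]}{2}} \langle\psi|\, \Big( \tfrac{1}{2}\big( |x\rangle\langle x| + |x'\rangle\langle x'| \big) - \tfrac{1}{2} U \big( |x\rangle\langle x| + |x'\rangle\langle x'| \big) U^\dagger \Big) \otimes D(0|\{x,x'\}) \,|\psi\rangle .$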




think of the commutativity assumption not as of a constraint, but rather as part of the definition of a quantum 
bilinear program 
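By choosing $D(0|\{x,x'\})$ to be an optimal measurement to distinguish between the states 

(156) $\frac{1}{2}\big( |x\rangle\langle x| + |x'\rangle\langle x'| \big)$ and $\frac{1}{2} U \big( |x\rangle\langle x| + |x'\rangle\langle x'| \big) U^\dagger ,$

we get a success probability of 

(157) $\frac{1}{2} + \frac{1}{6} \cdot \frac{1}{4} \sum_{\{x,x'\} \in \binom{[4]}{2}} \Big\| \tfrac{1}{2}\big( |x\rangle\langle x| + |x'\rangle\langle x'| \big) - \tfrac{1}{2} U \big( |x\rangle\langle x| + |x'\rangle\langle x'| \big) U^\dagger \Big\|_1 = \frac{1}{2} + \frac{1}{\sqrt{6}} .$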
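The value $\frac{1}{2} + \frac{1}{\sqrt{6}}$ can be checked numerically by summing the trace distances between the two candidate states over all six pairs $\{x, x'\}$. The sketch below does this in plain Python. It rests on a loudly stated assumption: the matrix `U` is one real orthogonal matrix with zero diagonal whose last row is $(-1, -1, 1, 0)/\sqrt{3}$, chosen here for illustration; its other rows are not taken from the text and need not coincide with the authors' choice. The eigenvalue routine is a textbook cyclic Jacobi iteration, used only to keep the example free of dependencies.

```python
import math
from itertools import combinations

# ASSUMPTION: illustrative orthogonal matrix; only the last row is from the text.
S3 = 1 / math.sqrt(3)
U = [[S3 * e for e in row] for row in (
    [0, 1, 1, 1],
    [-1, 0, -1, 1],
    [-1, 1, 0, -1],
    [-1, -1, 1, 0],
)]

def sym_eigenvalues(a, sweeps=100, tol=1e-24):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in a]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):   # columns p, q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):   # rows p, q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return [a[i][i] for i in range(n)]

def trace_norm(a):
    """Trace norm of a real symmetric matrix: sum of absolute eigenvalues."""
    return sum(abs(ev) for ev in sym_eigenvalues(a))

def pair_state(x, xp, rotate):
    """(1/2)(|x><x| + |x'><x'|), conjugated by U when rotate is True."""
    vecs = []
    for idx in (x, xp):
        if rotate:
            vecs.append([U[r][idx] for r in range(4)])   # U|idx> is column idx of U
        else:
            vecs.append([1.0 if r == idx else 0.0 for r in range(4)])
    return [[0.5 * sum(v[r] * v[c] for v in vecs) for c in range(4)] for r in range(4)]

def success_probability():
    """Evaluate 1/2 + (1/6)(1/4) * sum over pairs of the trace norm, as in (157)."""
    total = 0.0
    for x, xp in combinations(range(4), 2):
        rho = pair_state(x, xp, False)
        sigma = pair_state(x, xp, True)
        diff = [[rho[r][c] - sigma[r][c] for c in range(4)] for r in range(4)]
        total += trace_norm(diff)
    return 0.5 + total / 24

value = success_probability()
```

For this choice of `U`, every 2x2 principal submatrix has zero diagonal and off-diagonal entries $\pm 1/\sqrt{3}$, so each pair contributes the same trace norm and `value` agrees with $\frac{1}{2} + \frac{1}{\sqrt{6}}$ up to rounding.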

Appendix E. Quantum-Proof Randomness Extractors 

Here we give the missing proofs for the claims in Section 3.3. The full first level of our SDP 
hierarchy (37) for quantum-proof randomness extractors is as follows: 

(158) 


Err®*^?! (Ext, fc) = maximize , 


aE 




^fs{i)=j 




subject to e Pos(l -|- 2"' -|- 2'^+™') 


^0,0 “ ^0,40 

= E 

i 


V 

1,40 

^0,40 - ^{SJ),W 

V(s 

‘)-2k 1 q1 

> 2“* 


> 

2“^ + 

> 2~^ 


~\d+m 


Vw € El 


i-fcol 


-)d+m 


-)d+m 


The upper bound (71) is then immediate by ignoring some constraints. 

Proof of Theorem 3.2. The ideas for the proof are from [9, Theorem 5]. We first prove (72). For that 
we relax the positivity constraint in (71) from 


(159) 




-)m+d 


and ignore some of the other constraints in (71) leading to 

. . 1 


Err^dPi (Ext,/c) < maximize —j 

^ - 111 2d 


E 


^fs{i)—j ‘2m 


q1 


(160) 


subject to G Pos(l + 2'^ + 2^+™) 

i 

Vi'e[2”] 






0 = 1 


V(s,j)G[2-+d 

Moreover, we write $\Omega \in \mathrm{Pos}(1 + 2^n + 2^{d+m})$ as a Gram matrix: 

(161) =: au ■ Su', =: a„ • K, H), =: Vu,u' G [2""] U {0}, Vu,u' G 


)m+d 


U{0} 
An application of the Cauchy-Schwarz inequality then easily reveals that the optimal choice for 
is 

(162) 

Thus, the upper bound program becomes 


(163) 


becomes 


=i 

2m ] 


)=i “ 

2^] II 2 

maximize 

fE 

E 

A . .-J-’ 

2777, 


(«h) 

i 


subject to 

0 < Oj • 

(Xi^ 2 ^ * ^2^ 


^Si- 

Si' < 2-’^ Vz' G [2” 


E«5- 

(li> 

1. 


l^V 


Again using the Cauchy-Schwarz inequality we can write 

(164) 


sE 


(«h) 


2^ 

Letting (again), 

(165) 

we look at the expression 


E 

y . -ly 

fs{i)—j ^rn 

(Xi 

< 

\ 

fE 

M 
- 1 

i 



2 \ 

(s,j) 

i ^ 




■ 




(166) ^ 
where we made use of the constraints in (163) for the last inequality. Going back to the error Err(Ext, k) 
as in (64) we conclude the claim. 

We now prove (73). We upper bound $\mathrm{Err}^{\mathrm{sdp}_1}(\mathrm{Ext}, k)$ by forgetting several constraints and then 
apply Grothendieck’s inequality (see Lemma E.1 below): 


(169) Err"'^Pi(Ext, k) < max ^ ^ ^ ^ ' hs,j) ■ l|oi|l 2 < 2 
We partition the set of $i \in [2^n]$ into $\{i : a_i \geq 0\}$ and $\{i : a_i < 0\}$, and write 
Let us write 

(173) 

Now if $\alpha_+ \geq 1$, then we define 

(174) 
Observing that $\alpha_+ \leq 2^{n-k}$ we have 

(176) $\leq 2 \alpha_+\, \mathrm{Err}(\mathrm{Ext}, k + \log(\alpha_+))$

(177) $\leq 2 \cdot 2^{n-k}\, \mathrm{Err}(\mathrm{Ext}, k) ,$

with the error $\mathrm{Err}(\mathrm{Ext}, k)$ as in (64). Otherwise, if $\alpha_+ < 1$, then we define 

(178) $p_+(i) := \max\{a_i, 0\} + (1 - \alpha_+)\, 2^{-n} .$

We have 
(181) $\leq 2\, \mathrm{Err}(\mathrm{Ext}, k - 1) + 2 (1 - \alpha_+)\, \mathrm{Err}(\mathrm{Ext}, n) .$
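The role of the padded distribution in (178) can be illustrated with a short sketch. It assumes, as the appearance of $\mathrm{Err}(\mathrm{Ext}, k - 1)$ in (181) suggests but the surviving text does not state explicitly, that the positive parts satisfy $\max\{a_i, 0\} \leq 2^{-k}$ and that $\alpha_+$ denotes $\sum_i \max\{a_i, 0\}$. Under these assumptions $p_+$ sums to one and each entry is at most $2^{-k} + 2^{-n} \leq 2^{-(k-1)}$, so $p_+$ is a source of min-entropy at least $k - 1$. The function name `p_plus` and the sample numbers are hypothetical.

```python
def p_plus(a, n):
    """Distribution of (178): p_+(i) = max(a_i, 0) + (1 - alpha_+) * 2^{-n}.

    ASSUMPTION: alpha_+ is the sum of the positive parts of the a_i.
    Returns the list p_+ together with alpha_+.
    """
    if len(a) != 2 ** n:
        raise ValueError("expected 2^n coefficients")
    pos = [max(x, 0.0) for x in a]
    alpha_plus = sum(pos)
    if alpha_plus >= 1:
        raise ValueError("construction (178) is for alpha_+ < 1")
    pad = (1 - alpha_plus) * 2.0 ** (-n)   # uniform mass added to every point
    return [x + pad for x in pos], alpha_plus

# Hypothetical coefficients with n = 3, k = 2, so that max(a_i, 0) <= 2^{-k}.
n, k = 3, 2
a = [0.25, -0.1, 0.2, 0.0, -0.25, 0.1, 0.05, -0.2]
dist, alpha_plus = p_plus(a, n)
```

In this example the padding adds $0.4 / 8 = 0.05$ to every point, and the largest probability is $0.3 \leq 2^{-(k-1)} = 0.5$.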

With a similar argument for the set $\{i : a_i < 0\}$, we reach the bound 

(183) $2\, \mathrm{Err}(\mathrm{Ext}, k - 1) + (1 - \alpha_+ - \alpha_-)\, \mathrm{Err}(\mathrm{Ext}, n) \big|$

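(184) $\leq 6 \cdot 2^{n-k}\, \mathrm{Err}(\mathrm{Ext}, k - 1) .$

From this we conclude the claim 

(185) $\mathrm{Err}^{\mathrm{sdp}_1}(\mathrm{Ext}, k) \leq 6 \cdot 2^{n-k}\, \mathrm{Err}(\mathrm{Ext}, k - 1) .$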


Lemma E.1 (Grothendieck’s inequality). For any real matrix $\{A_{ij}\}$, we have 

(186) 